Systems and methods for discovery and analysis of markers

ABSTRACT

A business method for use in classifying patient samples. The method includes steps of collecting case samples representing a clinical phenotypic state and control samples representing patients without said clinical phenotypic state. Preferably the system uses a mass spectrometry platform system to identify patterns of polypeptides in said case samples and in the control samples without regard to the specific identity of at least some of said polypeptides. Based on identified representative patterns of the state, the business method provides for the marketing of diagnostic products using representative patterns. The present invention relates to systems and methods for identifying new markers, diagnosing patients with a biological state of interest, and marketing/commercializing such diagnostics. The present invention relates to systems and methods of greater sensitivity, specificity, and/or cost effectiveness.

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a Continuation of U.S. application Ser. No. 13/018,622, which is a Divisional of U.S. application Ser. No. 12/172,988, filed Jul. 14, 2008, now U.S. Pat. No. 7,906,758, which is a Continuation of U.S. application Ser. No. 11/178,262, filed Jul. 8, 2005, now U.S. Pat. No. 7,425,700, which is a Continuation in Part of U.S. patent application Ser. No. 10/760,100, filed Jan. 16, 2004, which is a Continuation in Part of U.S. application Ser. No. 10/645,863, filed Aug. 20, 2003, which claims priority to U.S. Provisional Application No. 60/473,272, filed May 22, 2003, each of which is incorporated herein by reference for all purposes. This application is also related to U.S. application Ser. No. 11/178,245, entitled “BIOLOGICAL PATTERNS FOR DIAGNOSIS AND TREATMENT OF CANCER”, filed Jul. 8, 2005, which is incorporated herein by reference for all purposes.

BACKGROUND OF THE INVENTION

The present inventions provide a business system and method for pharmaceutical, diagnostic, and biological research as well as applications of such research. Additionally, the present inventions provide a system for creation of assays such as assays based on the use of mass spectrometry.

A common aspect of all life on earth is the use of polypeptides as functional building blocks and the encryption of the instructions for the building blocks in the blueprint of nucleic acids (DNA, RNA). What distinguishes between living entities lies in the instructions encoded in the nucleic acids of the genome and the way the genome manifests itself in response to the environment as proteins. The complement of proteins, protein fragments, and peptides present at any specific moment in time defines who and what we are at that moment, as well as our state of health or disease.

One of the greatest challenges facing biomedical research and medicine is the limited ability to distinguish between specific biological states or conditions that affect an organism. This is reflected in the limited ability to detect the earliest stages of disease, anticipate the path any apparent disease may or will take in one patient versus another, predict the likelihood of response for any individual to a particular treatment, and preempt the possible adverse effects of treatments on a particular individual.

New technologies and strategies are needed to inform medical care and improve the repertoire of medical tools, as well as methods or business methods to utilize such technologies and strategies.

BRIEF SUMMARY OF THE INVENTION

According to one aspect, the present invention relates to systems comprising: a mass spectrometer; and a microfluidic device adapted for sample separation, wherein said microfluidic device has an electrospray ionization interface to said mass spectrometer. In some embodiments, the system above has a microfluidic device that is disposable and/or is composed of a polymeric material. In some embodiments, the system above has a microfluidic device adapted to reduce the amount of one or more abundant proteins from a sample or to remove sample components that are greater than 50 kD. Removal of abundant protein(s) or of components greater than 50 kD can be carried out using various devices, such as 96-well plates.

In any of the embodiments herein, a sample can be a fluid sample or non-fluid sample. Fluid samples include, but are not limited to, serum, plasma, whole blood, nipple aspirate, ductal lavage, vaginal fluid, nasal fluid, ear fluid, gastric fluid, pancreatic fluid, trabecular fluid, lung lavage, urine, cerebrospinal fluid, saliva, sweat, pericrevicular fluid, semen, prostatic fluid, and tears.

In any of the embodiments herein, the detection device can be a mass spectrometer, more preferably a time-of-flight (TOF) mass spectrometer, or more preferably an orthogonal acceleration, time-of-flight (OA-TOF) mass spectrometer (MS).

In any of the embodiments herein, the separation is performed by electrophoresis, more preferably, capillary electrophoresis, or more preferably zone capillary electrophoresis.

According to one aspect, the present invention relates to a method for screening an organism for a biological state or condition of interest comprising the steps of: obtaining a sample from the patient; providing a system comprising: a mass spectrometer and a microfluidic device adapted for sample separation, wherein the microfluidic device has an electrospray ionization interface to the mass spectrometer; and determining if the sample from the patient includes a marker for the biological state or condition of interest.

In any of the embodiments herein, an organism and/or a patient is preferably a human; the sample is a body fluid; the sample herein is preferably a blood, serum or plasma sample; and the biological state or condition of interest is selected from the group consisting of: cancer, cardiovascular disease, inflammatory disease, infectious disease, autoimmune disease, neurological disease, and pregnancy-related disorders.

A marker identified or used by the methods and systems herein can be a polypeptide, nucleic acid, lipid, small molecule, or any other composition or compound. In some embodiments, a marker is a polypeptide or a small molecule.

According to one aspect, the present invention relates to business methods.

In one embodiment, the business methods herein comprise: identifying one or more markers using a system comprising: a mass spectrometer and a microfluidic device adapted for sample separation, wherein the microfluidic device has an electrospray ionization interface to the mass spectrometer (more preferably electrospray ionization); and commercializing the one or more markers identified in the above step in a diagnostic product. The biomarkers identified are preferably polypeptides or small molecules. Such polypeptides can be previously known or unknown. The diagnostic product herein can include one or more antibodies that specifically bind to the marker (e.g., polypeptide).

In one embodiment, the business methods herein comprise: identifying one or more markers using a system comprising: a mass spectrometer and a microfluidic device adapted for sample separation, wherein the microfluidic device has an electrospray ionization interface to the mass spectrometer; and providing a diagnostic service to determine if an organism has or does not have a biological state or condition of interest. A diagnostic service herein may be provided by a CLIA approved laboratory that is licensed under the business or by the business itself. The diagnostic services herein can be provided directly to a health care provider, a health care insurer, or a patient. Thus the business methods herein can make revenue from selling, e.g., diagnostic services or diagnostic products.

According to one embodiment of the invention, a business method is provided that includes the steps of collecting more than 10 case samples representing a clinical phenotypic state and more than 10 control samples representing patients without said clinical phenotypic state; using a mass spectrometry platform system to identify patterns of polypeptides in said case samples and in said control samples without regard to the specific identity of at least some of said proteins; identifying representative patterns of the phenotypic state; and marketing diagnostic products using said representative patterns. Such patterns contain preferably more than 15 polypeptides that are represented on output of said mass spectrometer, but the identity of at least some of said more than 15 polypeptides is not known.

INCORPORATION BY REFERENCE

All publications and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 is a diagram illustrating preferred aspects of the inventions and systems used herein.

FIG. 2 illustrates a timing diagram showing operation of a parallel system.

FIG. 3 illustrates an SDS PAGE gel of serum with and without denaturation of serum with acid prior to ultrafiltration. Lane 1 of FIG. 3 is 0.025 μL of unprocessed serum; Lane 2 of FIG. 3 is 40 μL serum diluted 1:10 with water, passed through a 30 kD MWCO membrane; Lane 3 of FIG. 3 is 40 μL serum diluted 1:10 with water, passed through a 50 kD MWCO membrane; Lane 4 of FIG. 3 is 40 μL serum diluted 1:10 with 1% formic acid, passed through a 30 kD MWCO membrane; Lane 5 of FIG. 3 is 40 μL serum diluted 1:10 with 1% formic acid, passed through a 50 kD MWCO membrane.

FIG. 4 illustrates results of an experiment addressing the tradeoff between signal gain and resolution for zone electrophoresis (“ZE”) versus transient isotachophoresis-zone electrophoresis (“tiRP-ZE”) separations conducted using a capillary electrophoresis-electrospray ionization-mass spectrometry system.

FIG. 5(a) illustrates results of an experiment comparing base peak intensity (BPI) traces for pooled human serum separated by zone electrophoresis (lower trace) and by transient isotachophoresis-zone electrophoresis (upper trace).

FIG. 5(b) illustrates overlapping results for the two separations shown in FIG. 5(a).

FIG. 6 represents the CE-MS data illustrated in a two-dimensional (2-D) format, similar to that obtained through 2-D polyacrylamide gel electrophoresis (PAGE). The x-axis represents the mass-to-charge ratio and the y-axis represents the separation time. Mass spectra are acquired as components come out of the capillary or chip. Black regions represent mass-to-charge ratios and separation times where components are observed. White regions represent those where no components are observed.

FIG. 7 illustrates the migration time of neurotensin, one of the post-processing standards, plotted as a function of run order.

FIG. 8 illustrates the average mass spectra results for substance P (m/z 674.4, +2 charge state) where the difference in concentration between selected Groups A and B was 4-fold.

FIG. 9 illustrates the abundance ranges of various components in serum. Classical plasma proteins are high abundance components that are preferably removed from a sample prior to analysis.

FIG. 10 shows the results of an experiment addressing the separation of a mixture of seven polypeptides in acetonitrilic (bottom trace) and methanolic (top trace) solutions conducted using a capillary electrophoresis (CE)-electrospray ionization (ESI)-mass spectrometry (MS) system.

FIG. 11 illustrates an exemplary microfluidic device. The microfluidic device has a curved separation channel, a second channel for application of the electrospray/electrophoresis voltage, and the electrospray emitter tip. The tip is protected from mechanical damage by plastic extensions on either side.

FIG. 12 illustrates a two dimensional plot of a serum separation from the microfluidic device-electrophoresis-electrospray ionization mass spectrometry system.

FIG. 13 illustrates an expanded view of the electrospray tip.

FIG. 14 illustrates a TOF-MS coupled to a separation device.

FIG. 15 illustrates a mass spectrum comparison of a serum sample processed with and without pepstatin A.

FIGS. 16A and 16B illustrate mass spectra of a sample without pepstatin A (FIG. 16A) and with pepstatin A (FIG. 16B).

FIG. 17 is a schematic representation of the experimental design.

FIG. 18 is a schematic representation of an embodiment of the sample preparation process.

FIG. 19 is an overall flowchart illustrating the operation of one embodiment of the business method.

FIG. 20 illustrates one mass spectrometer that may be used herein.

DETAILED DESCRIPTION OF THE INVENTION

The term “organism” as used herein refers to any living being comprised of at least one cell. An organism can be as simple as a one cell organism or as complex as a mammal. An organism of the present invention is preferably a mammal. Such mammal can be, for example, a human or an animal such as a primate (e.g., a monkey, chimpanzee, etc.), a domesticated animal (e.g., a dog, cat, horse, etc.), farm animal (e.g., goat, sheep, pig, cattle, etc.), or laboratory animal (e.g., mouse, rat, etc.). Preferably, an organism is a human.

The term “polypeptide,” “peptide,” “oligopeptide,” or “protein” as used herein refers to any composition that includes two or more amino acids joined together by a peptide bond. It may be appreciated that polypeptides can contain amino acids other than the 20 amino acids commonly referred to as the 20 naturally occurring amino acids. Also, polypeptides can include one or more amino acids, including the terminal amino acids, which are modified by any means known in the art (whether naturally or non-naturally). Examples of polypeptide modifications include, e.g., glycosylation or other post-translational modification. Modifications which may be present in polypeptides of the present invention include, but are not limited to, acetylation, acylation, ADP-ribosylation, amidation, covalent attachment of flavin, covalent attachment of a heme moiety, covalent attachment of a polynucleotide or polynucleotide derivative, covalent attachment of a lipid or lipid derivative, covalent attachment of phosphatidylinositol, cross-linking, cyclization, disulfide bond formation, demethylation, formation of covalent cross-links, formation of cystine, formation of pyroglutamate, formylation, gamma-carboxylation, glycation, glycosylation, GPI anchor formation, hydroxylation, iodination, methylation, myristoylation, oxidation, proteolytic processing, phosphorylation, prenylation, racemization, selenoylation, sulfation, transfer-RNA mediated addition of amino acids to proteins such as arginylation, and ubiquitination.

Overview

The business methods herein utilize and apply a system that is able to differentiate biological states with reliability, reproducibility, and sensitivity. Additionally, the systems herein can be used to differentiate biological states or conditions with reliability, reproducibility, and sensitivity. The system and methods herein involve the process of obtaining sample(s) from organism(s); preparing the sample(s) (e.g., preferably by denaturing sample component(s)); separating components of the sample (e.g., using capillary electrophoresis, such that various components travel at various speeds); inputting the samples into a detection device (e.g., a mass spectrometer); and analyzing mass spectra patterns to detect markers that are associated with a particular biological state.

The preparation and separation steps herein can be accomplished using any means known in the art. In some embodiments, either or both of the preparation and separation steps occur on a microfluidic device. Such device is preferably disposable. When the methods herein involve the use of a mass spectrometer, a microfluidic device of the invention preferably provides a tip adapted for electrospraying the sample into the mass spectrometer. In some embodiments, the tip is adapted for sheath spraying. In some embodiments, the tip is adapted for non-sheath spraying. In any of the embodiments herein, the mass spectrometer may include a disposable inlet capillary.

In one embodiment, the system relies on an integrated, reproducible, sample preparation, separation and electrospray ionization system in a microfluidic format, with high sensitivity mass spectrometry and informatics. These systems can serve as the foundation for the discovery of patterns of markers, including polypeptides, that reflect and differentiate biological states or conditions specific for various states of health, disease, etc.

The present invention relates to systems and methods (including business methods) for identifying unique patterns that can be used for diagnosing a biological state or a condition in an organism, identifying markers based on the patterns, preparing diagnostics based on such markers, and commercializing/marketing diagnostics and services utilizing such diagnostics.

Markers of the present invention may be, for example, any composition and/or molecule or a complex of compositions and/or molecules that is associated with a biological state of an organism (e.g., a condition such as a disease or a non-disease state). A marker can be, for example, a small molecule, a polypeptide, a nucleic acid, such as DNA and RNA, a lipid, such as a phospholipid or a micelle, a cellular component such as a mitochondrion or chloroplast, etc. Markers contemplated by the present invention can be previously known or unknown. For example, in some embodiments, the methods herein may identify novel polypeptides that can be used as markers for a biological state of interest or condition of interest, while in other embodiments, known polypeptides are identified as markers for a biological state of interest or condition.

The systems and methods herein can rely on a microfluidic device, a detection device (e.g., a mass spectrometer), and an informatics tool to provide an integrated, reliable, reproducible, and sensitive analysis of a complex sample mixture. It shall be understood that various aspects of the invention described herein can be applied individually, collectively, or in different combinations with each other.

In some embodiments, the systems and methods herein are used to differentiate biological states or conditions with reliability, reproducibility, and sensitivity. In one embodiment, the system relies on an integrated, reproducible, sample preparation, separation and electrospray ionization system in a microfluidic format, with high sensitivity mass spectrometry and informatics. This system serves as the foundation for the discovery of patterns of markers, such as polypeptides, small molecules, or other biological markers that reflect and differentiate biological states or conditions specific for various states of health and disease. For purposes herein, polypeptides include, e.g., proteins, peptides, and/or protein fragments.

These patterns of markers (e.g., polypeptides) reflect and differentiate biological states or conditions and can be utilized in clinically useful formats and in research contexts. Clinical applications include detection of disease; distinguishing disease states to inform prognosis, selection of therapy, and the prediction of therapeutic response; disease staging; identification of disease processes; prediction of efficacy; prediction of adverse response; monitoring of therapy-associated efficacy and toxicity; and detection of recurrence.

The system used herein may be utilized both in studying protein patterns that distinguish case and control samples and in using patterns to diagnose individuals. FIG. 19 illustrates the overall process of the business methods disclosed herein. At step 101, the involved business (alone or with collaborators) collects a representative sample set of case samples and control samples. Case samples are those wherein a patient exhibits a particular biological state or condition, such as, for example, a disease state or other phenotype state. For example, the case samples may be those where a patient exhibits a response to a drug. Conversely, the control samples are collected from patients that do not exhibit the phenotype under study, such as those that do not have the disease or response to a drug.

Preferably more than 10 case and 10 control samples are collected for use or for identifying marker or protein signals of interest. Preferably more than 20 case and 20 control samples, preferably more than 50 case and 50 control samples, preferably more than 100 case and 100 control samples, and most preferably more than 500 case and 500 control samples are collected.

At step 103, the case and control samples are assayed to identify patterns of markers that are present in the case and control samples. In preferred embodiments the markers are polypeptides such as proteins, although they may also include small molecules, nucleic acids, polysaccharides, metabolites, lipids, or the like. Preferably, the patterns are obtained without advance selection or screening of the particular polypeptides involved. In some embodiments, the patterns are obtained without identification of some or all of the markers that are shown in the pattern. Three conceptual patterns are illustrated for cases at 104a and controls at 104b. As shown, the patterns are greatly simplified from those that will be actually observed.

Preferably the assay identifies the presence of more than 100 polypeptides, preferably more than 200 polypeptides, more preferably more than 500 polypeptides, more preferably more than 1000 polypeptides, and more preferably more than 2000 polypeptides. While the identity of some of the polypeptides will be known from prior studies, it is not necessary to specifically identify all of the polypeptides indicated by the assay. Instead, the business takes advantage of the presence of (or absence of) a pattern of many polypeptides repeatedly found to be in the cases in a pattern distinct from the controls. In various embodiments a number of polypeptides are represented in the pattern, but the identity of some of these polypeptides is not known. For example, more than 15 polypeptides can be represented, more than 30 polypeptides can be represented, more than 50 polypeptides can be represented, more than 100 polypeptides can be represented, and more than 1000 polypeptides can be represented.

The case and control samples are assayed to identify patterns of markers that are present in the case and control samples. In preferred embodiments the markers are polypeptides such as proteins, although they may also include small molecules, nucleic acids, polysaccharides, metabolites, lipids, or the like. Preferably, the patterns are obtained without advance selection or screening of the particular polypeptides involved. In some embodiments, the patterns are obtained without identification of some or all of the markers that are shown in the pattern. Preferably, more than 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99% of the markers in a sample are known.

In some embodiments, an assay identifies the presence of more than 100 markers, preferably more than 200, 300, or 400 markers, more preferably more than 500, 600, 700, 800, or 900 markers, more preferably more than 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, or 1900 markers, and more preferably more than 2000 markers. Preferably, the assay identifies the presence of more than 100 polypeptides, preferably more than 200 polypeptides, more preferably more than 500 polypeptides, more preferably more than 1000 polypeptides, and more preferably more than 2000 polypeptides. While the identity of some of the markers or polypeptides is known from prior studies, it is not necessary to identify specifically all of the markers or polypeptides indicated by the assay. The presence of (or absence of) a pattern of many markers or polypeptides repeatedly found to be in the cases in a pattern distinct from the controls can be used in the study of phenotypes and/or diagnostics. In various embodiments, a number of markers or polypeptides are represented in the pattern, but the identity of some of these markers or polypeptides is not known. In some embodiments, more than 15 markers can be represented, more than 30 markers can be represented, more than 50 markers can be represented, more than 100 markers can be represented, and more than 1000 markers can be represented. In some embodiments, more than 15 polypeptides can be represented, more than 30 polypeptides can be represented, more than 50 polypeptides can be represented, more than 100 polypeptides can be represented, and more than 1000 polypeptides can be represented.

In any of the embodiments herein, at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, or 2000 markers (e.g., polypeptides) are used to distinguish case individuals from control individuals.

In preferred embodiments, the business relies on a mass spectrometry system to perform the assays. Preferably such systems and methods allow for the capture and measurement of many or all of the instances of a marker or polypeptide in a sample that is introduced in the mass spectrometer for analysis. Using such systems it is preferable that one can observe those markers or polypeptides with high information-content but that are only present at low concentrations, such as those “leaked” from diseased tissue. Other high information-content markers or polypeptides may be those that are related to the disease, for instance, those that are generated in the tumor-host environment.

In some embodiments, an early assay, or discovery experiment, such as the first assay, is followed by a later assay. The early assay is normally used in initial identification of markers or polypeptides that identify or separate cases from controls. The later assay is adjusted according to parameters that can focus diagnostics or evaluation of regions of interest, such as regions of high differentiation or variability, i.e., those regions or markers where there are significant differences between case samples and control samples. The parameters can be determined by, for example, an early assay which may identify the regions of interest, which may be on one technology platform, and a later assay on the same or a different platform.
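As a minimal sketch of how regions of interest from an early assay might be ranked before a later assay is configured, the following assumes each run has already been reduced to a list of intensities, one per region (for example, one per m/z and separation-time bin). The function name `regions_of_interest`, the Welch-style score, and the `top_n` cutoff are illustrative assumptions and are not taken from the specification.

```python
from statistics import mean, stdev


def regions_of_interest(case_runs, control_runs, top_n=2):
    """Rank regions by how strongly cases differ from controls.

    case_runs / control_runs: lists of runs, each run a list of
    intensities with one entry per region (hypothetical layout).
    Returns the indices of the top_n regions, best first.
    """
    scores = []
    for region in range(len(case_runs[0])):
        a = [run[region] for run in case_runs]
        b = [run[region] for run in control_runs]
        diff = abs(mean(a) - mean(b))
        # Standard error of the difference (Welch-style, unequal variances).
        se = (stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b)) ** 0.5
        if se > 0:
            score = diff / se
        else:
            # No spread at all: any nonzero difference is maximally clear.
            score = float("inf") if diff > 0 else 0.0
        scores.append((score, region))
    scores.sort(reverse=True)
    return [region for _, region in scores[:top_n]]


# Illustrative (made-up) intensities: three case runs, three control runs.
case_runs = [[5.0, 1.0, 9.0, 3.0], [5.2, 1.1, 8.5, 3.1], [4.9, 0.9, 9.4, 2.9]]
control_runs = [[5.1, 4.0, 9.1, 3.0], [4.8, 4.2, 8.8, 3.2], [5.1, 3.9, 9.2, 2.8]]
best = regions_of_interest(case_runs, control_runs, top_n=1)
```

With these made-up values only region 1 differs clearly between the two groups, so a later assay could be focused there.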

At step 105, bioinformatics systems are utilized to identify the differences in patterns, or the polypeptide patterns, in the case and control samples. Such techniques may be preceded by various data cleanup steps. Patterns can be composed of the relative representation of numerous markers (e.g., polypeptides, other biological entities, small molecules, etc.), the collective profile of which is more important than the presence or absence of any specific entities. By identifying patterns in blood or other patient samples, the methods herein provide a window not only onto the presence of disease and other pathology in some embodiments, but also onto the body's ongoing response to the disease or pathologic condition in other embodiments. In a high throughput mode (pipelined system operation), data from a first sample are evaluated in a bio-informatics system at the same time another sample is being processed in a detection device using, for example, a mass spectrometry system.
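The pipelined mode described above can be sketched, under stated assumptions, with a single worker thread that analyzes one sample while the next is being acquired. The functions `acquire` and `analyze` are hypothetical stand-ins for the detection device and the informatics step; they are not part of any instrument or library API.

```python
from concurrent.futures import ThreadPoolExecutor


def acquire(sample_id):
    """Stand-in for a detection-device run; returns a fake spectrum."""
    return {"sample": sample_id, "intensities": [sample_id, sample_id + 1]}


def analyze(spectrum):
    """Stand-in for the informatics step; sums the intensities."""
    return spectrum["sample"], sum(spectrum["intensities"])


def run_pipeline(sample_ids):
    """Overlap analysis of sample k with acquisition of sample k+1."""
    results = []
    pending = None  # analysis of the previous sample, if still in flight
    with ThreadPoolExecutor(max_workers=1) as pool:
        for sample_id in sample_ids:
            # The device works on this sample while the worker thread
            # is still analyzing the previous one.
            spectrum = acquire(sample_id)
            if pending is not None:
                results.append(pending.result())
            pending = pool.submit(analyze, spectrum)
        if pending is not None:
            results.append(pending.result())
    return results


pipeline_results = run_pipeline([1, 2, 3])
```

Results come back in run order because each analysis is collected before the next one is submitted.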

As shown in the three simplified patterns for “cases” 104a, peaks 106a and 106b tend to be observed in three “case” samples at higher levels. Conversely, little or no signal is observed at peak 106c in the three case samples. By contrast, in the control samples 104b, peaks 106a and 106c tend to be observed while peak 106b tends to be at low levels. Of course, the patterns shown in FIG. 1 are greatly simplified, and there will be much more complex patterns in actual practice, such as tens, hundreds, or thousands of such peaks. In the particular example illustrated in FIG. 1, peak 106a is not informative, while peak 106b tends to occur in cases, and peak 106c tends to occur in controls. Automated systems will generally be applied in the identification of the patterns that distinguish cases and controls. The measurement of patterns of multiple signals will enable the identification of subtle differences in biological state and make the identification of that state more robust and less subject to biological noise.
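The simplified three-peak example above can be sketched as follows. The intensities are made up for illustration, and the fold-change threshold `min_ratio` and the function name `classify_peaks` are assumptions rather than anything specified herein; an actual automated system would handle far more peaks and would need proper statistics.

```python
from statistics import mean


def classify_peaks(cases, controls, min_ratio=2.0, floor=1e-9):
    """Label each peak as tending to occur in cases, in controls, or neither.

    cases / controls: lists of dicts mapping a peak label to an intensity.
    min_ratio and floor are illustrative thresholds (assumptions).
    """
    labels = {}
    for peak in sorted(set().union(*cases, *controls)):
        case_mean = mean(s.get(peak, 0.0) for s in cases)
        control_mean = mean(s.get(peak, 0.0) for s in controls)
        if case_mean >= min_ratio * max(control_mean, floor):
            labels[peak] = "case"
        elif control_mean >= min_ratio * max(case_mean, floor):
            labels[peak] = "control"
        else:
            labels[peak] = "uninformative"
    return labels


# Made-up intensities mirroring the three simplified patterns.
cases = [
    {"106a": 10.0, "106b": 9.0, "106c": 0.5},
    {"106a": 11.0, "106b": 8.0, "106c": 0.0},
    {"106a": 9.0, "106b": 10.0, "106c": 1.0},
]
controls = [
    {"106a": 10.0, "106b": 1.0, "106c": 8.0},
    {"106a": 9.5, "106b": 0.5, "106c": 9.0},
    {"106a": 10.5, "106b": 1.5, "106c": 7.0},
]
peak_labels = classify_peaks(cases, controls)
```

Here peak 106a is present at similar levels in both groups and is labeled uninformative, while 106b and 106c are assigned to cases and controls respectively.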

At step 107, the business uses the patterns of markers (e.g., polypeptides) present in the sample to identify the disease state of a patient sample in, for example, a diagnostic setting. Samples used in both the steps 101 and 107 can, in preferred embodiments, be serum samples, although tissue or bodily fluid samples from a variety of sources can be used in alternative embodiments. Preferably, though not necessarily, the system used in the diagnostic application is based upon the same technology platform as the platform used to identify the patterns in the first instance. For example, if the platform used to identify the patterns in the first instance is a time of flight (TOF) mass spectrometer, it is preferred that the diagnostic applications of the patterns are run on a time of flight mass spectrometer.

The marketing of the products can take a number of forms. For example, it may be that the developer actually markets the instruments and assays into the diagnostic research market. In alternative embodiments, the developer of the patterns will partner with, for example, a large diagnostic company that will market those products made by the developer, alone or in combination with their own products. In alternative embodiments, the developer of the patterns licenses the intellectual property in the patterns to a third party and derives revenue from licensing income arising from the pattern information.

The business method herein can obtain revenue by various means, which may vary over time. Such sources may include direct sale revenue of products, upfront license fees, research payment fees, milestone payments (such as upon achievement of sales goals or regulatory filings), database subscription fees, and downstream royalties, and may come from various sources including government agencies, academic institutions and universities, biotechnology and pharmaceutical companies, insurance companies, and health care providers.

Often, diagnostic services hereunder will be offered by clinical reference laboratories or by way of the sale of diagnostic kits. Clinical reference laboratories generally process large numbers of patient samples on behalf of a number of care givers and/or pharmaceutical companies. Such reference laboratories in the United States are normally qualified under CLIA and/or CAP regulations. Of course, other methods may also be used for marketing and sales, such as direct sales of kits such as FDA or equivalent approved products. In some cases the developer of the pattern content will license the intellectual property and/or sell kits and/or reagents to a reference laboratory that will combine them with other reagents and/or instruments in providing a service.

In the short term, the business methods disclosed generate revenue by, for example, providing application specific research or diagnostic services to third parties to discover and/or market the patterns. Examples of third parties include customers who purchase diagnostic or research products (or services for discovery of patterns), licensees who license rights to pattern recognition databases, and partners who provide samples in exchange for downstream royalty rights and/or upfront payments from pattern recognition. Depending on the fee, diagnostic services may be provided on an exclusive or non-exclusive basis.

Revenue can also be generated by entering into exclusive and/or non-exclusive contracts to provide polypeptide profiling of patients and populations. For example, a company entering clinical trials may wish to stratify a patient population according to, for example, drug regimen, effective dosage, or otherwise. Stratifying a patient population may increase the efficacy of a clinical trial (by removing, for example, non-responders), thus allowing the company to enter into the market sooner or allow a drug to be marketed with a diagnostic test that identifies patients that may have an adverse response or be non-responsive. In addition, insurance companies may wish to obtain a polypeptide profile of a potential insured and/or to determine if, for example, a drug or treatment will be effective for a patient.

In the long term, revenue may be generated by alternative methods. For example, revenue can be generated by entering into exclusive and/or non-exclusive drug discovery contracts with drug companies (e.g., biotechnology companies and pharmaceutical companies). Such contracts can provide for downstream royalties on a drug based on the identification or verification of drug targets (e.g., a particular protein or set of polypeptides associated with a phenotypic state of interest), or on the identification of a subpopulation in which such drug should be utilized. Alternatively, revenue may come from a licensee fee on a diagnostic itself. The diagnostic services, patterns, and tools herein can further be provided to a pharmaceutical company in exchange for milestone payments or downstream royalties. Revenue may also be generated from the sale of disposable fluidics devices, disposable microfluidics devices, or other assay reagents or devices in, for example, the research market, diagnostic market, or in clinical reference laboratories. Revenue may also be generated from licensing of applications-specific software or databases. Revenue may, still further, be generated based on royalties from technology platform providers who may license some or all of the proprietary technology. For example, a mass-spectrometer platform provider may license the right to further distribute software and computer tools and/or polypeptide patterns.

In preferred embodiments, the mass spectrometer or TOF device utilized herein is coupled to a microfluidic device, such as a separations device. The sample preparation techniques used preferably concentrate the markers (e.g., polypeptides or small molecules) that the mass spectrometer is best able to detect and/or that are most informative, and deplete the ones that are more difficult to detect and/or are less informative (because, for example, they appear in both case and control samples). Prepared samples may then be placed on a microfluidic device, separated, and electrosprayed into a mass spectrometer.

In most preferred embodiments the microfluidic separations device is a disposable device that is readily attached to and removed from the mass spectrometer, and sold as a disposable, thereby providing a recurring revenue stream to the involved business and a reliable product to the consumer. Preferably, a mass spectrometer is utilized that accepts a continuous sample stream for analysis and provides high sensitivity throughout the detection process.

Any of the methods and systems herein can be automated to require no manual intervention for at least 1, 2, 3, 4, 5, 6, 7, 8, 9, or more preferably at least 10 hours.

Sample preparation, in some embodiments, includes the removal of high abundance markers or polypeptides, denaturation, removal of markers or polypeptides expected to be in abundance in all samples, addition of preservatives and calibrants, and desalting. These steps allow sensitive measurement of concentrations of information-rich markers, or more preferably information-rich polypeptides, such as those that have leaked from tissue, as compared to markers or polypeptides that would carry little information, such as those highly abundant and native to serum. Prepared samples can then be separated using fast molecular separations methods with high peak capacities. An electrospray-ionization (ESI) interface may be integrated on the microfluidic device (chip), which ionizes and sprays the prepared and separated sample directly into a mass spectrometer and is preferably sold as part of a disposable component to assure that there is no carry-over between samples, and to assure high reliability of the system.

In another embodiment, the system's reproducibility and resolution allow for the differentiation of different levels of markers between case and control samples, even for high abundance components that are not removed by the sample preparation steps. The system resolution allows for the differentiation of modified forms of the components, e.g., modified polypeptides, in which the modification or the level of the modified molecule is the marker.

The microfluidic-based separations preferably provide the marker mixtures and polypeptide mixtures at flow rates and at complexity levels that are matched to the mass spectrometer's optimal performance regions. The mass spectrometer's sensitivity is preferably optimized to detect the species most likely to differentiate between biological states or conditions. Preferably, the reagents used for performing these steps are provided in or along with the microfluidic device, thereby allowing for additional recurring revenue to the involved business and higher performance for the user.

The sample preparation system provides for different operations depending upon the detection device to be utilized. The sample preparation system preferably provides for protein denaturation prior to processing on the mass spectrometer. Analytes of interest herein may in some cases be a protein in a bound form. Preferably the system provides for denaturation of proteins prior to the removal of high abundance materials (such as albumin or other proteins from serum or plasma samples). By denaturing such proteins prior to their removal, bound analytes of interest can be released such that they can be meaningful in later analysis. Denaturation may utilize any of several techniques including the use of heat, high salt concentrations, acids, bases, chaotropic agents, organic solvents, detergents and/or reducing agents. Liotta, Lance, A., et al., “Written in Blood,” Nature (Oct. 30, 2003), Volume 425, page 905. Tirumalai, Radhakrishna S., et al. “Characterization of the Low Molecular Weight Human Serum Proteome,” Molecular & Cellular Proteomics 2.10 (Aug. 13, 2003), pages 1096-1103.

The system used for removal of high abundance markers (e.g., polypeptides) may be based on, for example, the use of high affinity reagents for removal of the markers (e.g., polypeptides), the use of high molecular weight filters, ultracentrifugation, precipitation, and/or electrodialysis. Polypeptides that are often removed include, for example, those involved in normal metabolism, and a wide variety of other indications not of relevance to a particular assay. Such markers or proteins may be removed through, for example, a solid phase extraction resin or using a device that removes such proteins with antibodies (e.g., Agilent's High-Capacity Multiple Affinity Removal System). Additionally, the system may include a reversed phase chromatography device, for example, for separation or fractionation of small molecules and/or to trap, desalt, and separate or fractionate a marker or protein mixture.

FIG. 1 illustrates additional aspects of an exemplary system platform used herein. The invention involves an integrated system to a) discover and b) assay patterns of markers including polypeptides that reflect and differentiate biological and clinical states of organisms, including patients, in biological materials including but not limited to body fluids.

Biological and clinical states include but are not limited to phenotypic states; conditions affecting an organism; states of development; age; health; pathology; disease detection, process, or staging; infection; toxicity; or response to chemical, environmental, or drug factors (such as drug response phenotyping, drug toxicity phenotyping, or drug effectiveness phenotyping).

Biological fluids 201 include but are not limited to serum, plasma, whole blood, nipple aspirate, ductal lavage, vaginal fluid, nasal fluid, ear fluid, gastric fluid, pancreatic fluid, trabecular fluid, lung lavage, urine, cerebrospinal fluid, saliva, sweat, pericrevicular fluid, semen, prostatic fluid, and tears.

The system provides for the integration of fast molecular separations and electrospray ionization system 204 on a microfluidic platform 203. The system provides processed samples to a high sensitivity time of flight mass spectrometer 205. Signal processing system and pattern extraction and recognition tools 207 incorporate domain knowledge to extract information from polypeptide patterns and classify the patterns to provide a classification 209. The signal processing system may include or be coupled to other software elements as well. For example, the signal processing system may provide for an easy to use user interface on the associated computer system and/or a patient database for integration of results into an institution's laboratory or patient information database system.

The microfluidic device(s) 203 and 204 may be formed in plastic by means of etching, machining, cutting, molding, casting or embossing. The microfluidic device(s) may be made from glass or silicon by means of etching, machining, or cutting. The device may be formed by polymerization on a form or other mold. The device may be made from a polymer by machining, cutting, molding, casting, or embossing. The molecular separations unit or the integrated fast molecular separations/electrospray ionization unit may provide additional sample preparation steps, including sample loading, sample concentration, removal of salts and other compounds that may interfere with electrospray ionization, removal of highly abundant species, selective capture of specific molecules with affinity reagents, concentration of the sample to a smaller volume, proteolytic or chemical cleavage of components within the biological material, enzymatic digestion, and/or aliquoting into storage containers. The particular operations performed by the device depend upon the detection technology that is utilized.

The device(s) for separations and electrospray may be either single use for a single sample, multi-use for a single sample at a time with serial loading, single use with parallel multiple sample processing, multi-use with parallel multiple sample processing, or a combination. Separations processes may include isoelectric focusing, electrophoresis, chromatography, or electrochromatography. The separations device may include collection areas or entities for some or all of the purified or partially purified fractions.

It is to be understood that the inventions herein are illustrated primarily with regard to mass spectrometry as a detection device, but other devices may be used alone or with the mass spectrometer. For example, detection devices may include electrochemical, spectroscopic, or luminescent detectors, and may be integral with the microfluidics device.

Mass spectrometers that may be used include quadrupole, ion trap, magnetic sector, orbitrap, and Fourier transform ion cyclotron resonance instruments, or an orthogonal multiplex time-of-flight mass spectrometer which includes an analyzer that receives an ion beam from an electrospray ionization (ESI) source.

FIG. 20 illustrates a mass spectrometer system 205 in greater detail in one specific embodiment of the invention. FIG. 20 shows an orthogonal multiplex time-of-flight mass spectrometer which includes an analyzer that receives an ion beam from an electrospray ionization (ESI) source 301 such as disclosed in U.S. Ser. No. 10/395,023. By “multiplex” in this context it is intended to mean a system that processes multiple ion packets at the same time. The ion beam is initially introduced into analyzer 303 along an axis 305, and the analyzer generally accumulates differing size packets of ions of the beam and accelerates the packets of ions laterally along a flight path 307. The pulses or packets of ions are spaced in time and along the flight path by different accumulation periods, and the speed of travel of the ions along flight path 307 varies with a mass-to-charge ratio (m/z) such that the ions of sequential pulses, and often the ions of three or more pulses, will arrive intermingled at one time at a detector 309.
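The intermingling of sequential pulses can be illustrated with the idealized time-of-flight relation, in which an ion accelerated through a potential V travels at a speed proportional to the square root of z/m. The following is a minimal sketch only; the 1 m flight path, 5 kV accelerating potential, 10 microsecond pulse spacing, and the m/z values are hypothetical and are not taken from the system of FIG. 20.

```python
import math

E_CHARGE = 1.602176634e-19      # elementary charge, C
DALTON = 1.66053906660e-27      # unified atomic mass unit, kg

def flight_time_s(mz, path_m=1.0, accel_volts=5000.0):
    """Idealized drift time for an ion of mass-to-charge ratio mz (Da per charge).

    path_m and accel_volts are assumed values for illustration.
    """
    # Energy balance z*e*V = m*v^2/2 gives v = sqrt(2*e*V / (m/z)).
    speed = math.sqrt(2.0 * E_CHARGE * accel_volts / (mz * DALTON))
    return path_m / speed

def arrival_order(packets):
    """packets: iterable of (launch_time_s, mz).

    Returns (arrival_s, launch_s, mz) tuples sorted by arrival time.
    """
    return sorted((t0 + flight_time_s(mz), t0, mz) for t0, mz in packets)

# Two pulses launched 10 microseconds apart, each holding a light and a heavy ion.
packets = [(0.0, 500.0), (0.0, 4000.0), (10e-6, 500.0), (10e-6, 4000.0)]
for arrival, launch, mz in arrival_order(packets):
    print(f"m/z {mz:6.0f} launched at {launch * 1e6:4.1f} us "
          f"arrives at {arrival * 1e6:6.2f} us")
```

Under these assumed values the light ion of the second pulse reaches the detector before the heavy ion of the first pulse, which is the overlap that the multiplexed detection scheme must later untangle.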

In addition to analyzer 303, the system includes a driver 311 to intermittently energize lateral acceleration electrodes of analyzer 303. Driver 311 modulates or encodes the beam with the pseudorandom sequence by reference to a clock signal supplied from a multichannel scaler 313. Driver 311 also supplies a trigger signal to the multichannel scaler 313 to signal the start of a sequence. An output signal from detector 309 is amplified by an amplifier 315 and is counted by multichannel scaler 313.

The pseudorandom sequence applied by driver 311 will typically provide for time periods which may each be defined as integer multiples of a unit accumulation time. To facilitate reconstruction of a spectrum from the signal generated by detector 309, multichannel scaler 313 may count the amplified signal from amplifier 315 into time bins which represent integral fractions of this unit time. These counts can then be sent to a computer 317 for reconstruction of a particular spectrum and characterization of the sample material introduced into the system via ESI source 301.
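The reconstruction step can be sketched for one common choice of pseudorandom sequence. The text does not say which sequence driver 311 applies, so the length-7 maximal-length binary sequence below is an assumption, as are the one-bin-per-unit-time resolution, the circular (repeating) model, and the example counts. For such a sequence the overlapped detector counts can be inverted in closed form.

```python
# Length-7 maximal-length sequence (assumed); 1 = a packet is launched in that unit time.
SEQ = [1, 1, 1, 0, 1, 0, 0]

def encode(spectrum, seq=SEQ):
    """Detector counts per time bin when all launched packets overlap (circular model).

    An ion launched at time j with flight time k is counted in bin (j + k) mod n.
    """
    n = len(seq)
    return [sum(seq[(t - k) % n] * spectrum[k] for k in range(n)) for t in range(n)]

def decode(counts, seq=SEQ):
    """Recover the single-packet spectrum with the closed-form m-sequence inverse,
    x[k] = 2/(n+1) * sum_t (2*seq[t-k] - 1) * y[t].
    """
    n = len(seq)
    scale = 2.0 / (n + 1)
    return [scale * sum((2 * seq[(t - k) % n] - 1) * counts[t] for t in range(n))
            for k in range(n)]

true_spectrum = [0, 5, 0, 3, 0, 0, 1]   # hypothetical counts per flight-time bin
counts = encode(true_spectrum)
recovered = [round(v) for v in decode(counts)]
print("detector counts:", counts)
print("recovered spectrum:", recovered)
```

The closed-form inverse holds only for maximal-length sequences; a different pseudorandom sequence would need a general deconvolution, and real data would also carry counting noise that this sketch ignores.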

Computer 317 may also control a variety of additional components of system 205, with a wide variety of alternative data processing being possible. The structure and use of driver 311, multichannel scaler 313, amplifier 315 and computer 317 may in some embodiments be those such as shown in U.S. Pat. No. 6,300,626, entitled “Time-of-Flight Mass Spectrometer and Ion Analysis” and issued to Brock et al. on Oct. 9, 2001, which is fully incorporated by reference along with all other references cited in this application.

In preferred embodiments the system also adapts the speed of the system in response to the detection of known markers that are likely to be present in all samples, and which are readily detectable. Separations may often vary in retention or migration time. By detecting molecules that are known, likely to be in all samples, and easily detectable, and then comparing the speed at which they have passed through the system with a standard from other experiments, it becomes possible to speed up the separations in response to the detection of slower than expected migration times, or to slow the system down in response to faster than expected migration times. The speed may be adjusted through, for example, adjustments in system pressure, voltage, current flow, or temperature. Preferably, the system is operated faster or slower by changing the voltage. Thus the speed of the system can be fine tuned to detect specific markers.
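The speed adaptation described above amounts to a feedback loop on the migration time of a reference marker. The following is a minimal sketch of one way such a loop could look, assuming a simple proportional correction applied to the separation voltage; the gain, the voltage limits, and the function name are hypothetical and are not specified in the text.

```python
def adjust_voltage(voltage, expected_s, observed_s,
                   gain=0.5, v_min=1000.0, v_max=30000.0):
    """Proportional correction of the separation voltage (all limits assumed).

    expected_s: migration time of the reference marker from earlier runs.
    observed_s: migration time of the same marker in the current run.
    """
    # Positive error: the marker arrived later than expected, so the run is slow
    # and the voltage is raised; negative error lowers the voltage.
    relative_error = (observed_s - expected_s) / expected_s
    new_voltage = voltage * (1.0 + gain * relative_error)
    # Clamp to the assumed safe operating range of the device.
    return max(v_min, min(v_max, new_voltage))

# A reference marker expected at 60 s is seen at 66 s, so the voltage is raised.
print(adjust_voltage(10000.0, expected_s=60.0, observed_s=66.0))
# The same marker seen early at 54 s leads to a lower voltage.
print(adjust_voltage(10000.0, expected_s=60.0, observed_s=54.0))
```

A proportional rule is only one option; pressure, current flow, or temperature could be adjusted by the same pattern, as the paragraph above notes.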

Representative markers (e.g., peptides and proteins) that could be spiked into samples for quality control include neurotensin, lysozyme, aprotinin, insulin b-chain, and renin substrate. In addition, the speed of operation of the device may be slowed to provide greater accuracy in the detection of molecules of particular interest in a spectrum. Conversely, the system may be operated more quickly during the times when components of low interest would be expected to be detected.

In some embodiments pressure is added to move the components through the electrophoretic device, especially to migrate components to the end of an electrophoretic separation capillary (in conjunction with the use of the electroosmotic flow). The pressure produces buffer flow that is used to maintain a stable electrospray.

Ions formed by electrospray ionization may be singly or multiply charged ions of molecules, with charge coming from protons or alkali metal ions bound to the molecules. Ion excitation may be produced by collision of ions with background gas or an introduced collision gas. Alternatively, excitation may be from collision with other ions or a surface, or from interaction with photons, heat, electrons, or alpha particles. Through excitation of the sample in an electrospray, the information content of the process can be altered and/or enhanced. Such excitation may, for example, desolvate ions, dissociate noncovalently bound molecules from analyte ions, break up solvent clusters, fragment background ions to change their mass to charge ratio and move them to a ratio that may interfere less with the analysis, strip protons and other charge carriers such that multiply charged ions move to different regions of the spectrum, and fragment analyte ions to produce additional, more specific or sequence-related information.

In preferred embodiments the excitation system may be turned on and off to obtain a set of spectra in both states. The information content of the two spectra is, in most cases, far greater than the information content of either single spectrum. In such embodiments the system includes a switching device for activating and de-activating the excitation/ionization system. Analysis software is configured in this case to analyze the sample separately both in the “on” state of the excitation system and in the “off” state of the excitation system. Different markers may be detected more efficiently in one or the other of these two states.

FIG. 2 illustrates the pipelined systems operations in greater detail. As shown at step 351, a first sample is acquired during this time frame and separated in the microfluidics device, and then processed in the mass spectrometer. At step 353 a second sample is processed in the microfluidics device and processed in the mass spectrometer. During at least some of the time when the second sample is being processed at step 353, the data from the mass spectrum for the first sample are processed in the data analysis system at step 357. Similarly, at step 355 a third sample is processed in the microfluidics device and the mass spectrometer, while the data from sample 2 are being analyzed in the data analysis system at step 359.
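The benefit of the pipelined operation can be sketched as a two-stage schedule in which the instrument (microfluidics device plus mass spectrometer) and the data analysis system work on different samples at the same time. The stage durations below are hypothetical; FIG. 2 does not specify them.

```python
def pipeline_schedule(n_samples, acquire_min, analyze_min):
    """Return per-sample (acq_start, acq_end, ana_start, ana_end) times in minutes."""
    schedule = []
    acq_end = 0.0
    ana_end = 0.0
    for _ in range(n_samples):
        acq_start = acq_end                   # instrument starts the next sample at once
        acq_end = acq_start + acquire_min
        ana_start = max(acq_end, ana_end)     # needs the data and a free analysis system
        ana_end = ana_start + analyze_min
        schedule.append((acq_start, acq_end, ana_start, ana_end))
    return schedule

def total_time(schedule):
    """Time at which analysis of the last sample finishes."""
    return schedule[-1][3]

# Three samples, assuming 10 min of separation/MS and 8 min of data analysis each.
plan = pipeline_schedule(3, acquire_min=10, analyze_min=8)
for i, (a0, a1, d0, d1) in enumerate(plan, start=1):
    print(f"sample {i}: instrument {a0:.0f}-{a1:.0f} min, analysis {d0:.0f}-{d1:.0f} min")
print("pipelined total:", total_time(plan), "min; strictly serial:", 3 * (10 + 8), "min")
```

In this sketch the analysis of each sample overlaps with acquisition of the next, so only the final analysis adds to the instrument time.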

Sample Collection

In some embodiments, the system and methods (including business methods) herein involve obtaining sample(s) from organism(s) as is illustrated in FIG. 1, element 201. Preferably the organism is a human. Such samples can be in liquid or non-liquid form.

Examples of liquid samples that can be obtained from an organism, such as a patient, include, but are not limited to, serum, plasma, whole blood, nipple aspirate, ductal lavage, vaginal fluid, nasal fluid, ear fluid, gastric fluid, pancreatic fluid, trabecular fluid, lung lavage, urine, cerebrospinal fluid, saliva, sweat, pericrevicular fluid, semen, prostatic fluid, and tears.

Examples of non-liquid samples include samples from tissue, bone, hair, cartilage, tumor cells, etc. Non-liquid samples may be dissolved in a liquid medium containing, e.g., detergent, chaotrope, denaturant, acid, base, protease or reducing agent prior to further analysis.

In preferred embodiments, samples collected are in liquid form. Preferably, samples collected are serum or plasma.

Case samples are obtained from individuals with a particular phenotypic state of interest. Examples of phenotypic states include phenotypes resulting from an altered environment, drug treatment, genetic manipulations or mutations, injury, change in diet, aging, or any other characteristic(s) of a single organism or a class or subclass of organisms. In a preferred embodiment, a phenotypic state of interest is a clinically diagnosed disease state. Such disease states include, for example, cancer, cardiovascular disease, inflammatory disease, autoimmune disease, neurological disease, infectious disease and pregnancy related disorders. Control samples are obtained from individuals who do not exhibit the phenotypic state of interest or disease state (e.g., an individual who is not affected by a disease or who does not experience negative side effects in response to a given drug). Alternatively, states of health can be analyzed.

Cancer phenotypes are studied in some aspects of the invention or business method. Examples of cancer include, but are not limited to: breast cancer, skin cancer, bone cancer, prostate cancer, liver cancer, lung cancer, brain cancer, cancer of the larynx, gallbladder, pancreas, rectum, parathyroid, thyroid, adrenal, neural tissue, head and neck, colon, stomach, bronchi, kidneys, basal cell carcinoma, squamous cell carcinoma of both ulcerating and papillary type, metastatic skin carcinoma, osteosarcoma, Ewing's sarcoma, reticulum cell sarcoma, myeloma, giant cell tumor, small-cell lung tumor, non-small cell lung carcinoma, gallstones, islet cell tumor, primary brain tumor, acute and chronic lymphocytic and granulocytic tumors, hairy-cell tumor, adenoma, hyperplasia, medullary carcinoma, pheochromocytoma, mucosal neuromas, intestinal ganglioneuromas, hyperplastic corneal nerve tumor, marfanoid habitus tumor, Wilms' tumor, seminoma, ovarian tumor, leiomyomater tumor, cervical dysplasia and in situ carcinoma, neuroblastoma, retinoblastoma, soft tissue sarcoma, malignant carcinoid, topical skin lesion, mycosis fungoides, rhabdomyosarcoma, Kaposi's sarcoma, osteogenic and other sarcoma, malignant hypercalcemia, renal cell tumor, polycythemia vera, adenocarcinoma, glioblastoma multiforme, leukemias, lymphomas, malignant melanomas, epidermoid carcinomas, and other carcinomas and sarcomas.

Cardiovascular disease may be studied in other applications of the invention. Examples of cardiovascular disease include, but are not limited to, congestive heart failure, high blood pressure, arrhythmias, atherosclerosis, cholesterol, Wolff-Parkinson-White Syndrome, long QT syndrome, angina pectoris, tachycardia, bradycardia, atrial fibrillation, ventricular fibrillation, myocardial ischemia, myocardial infarction, cardiac tamponade, myocarditis, pericarditis, arrhythmogenic right ventricular dysplasia, hypertrophic cardiomyopathy, Williams syndrome, heart valve diseases, bacterial endocarditis, pulmonary atresia, aortic valve stenosis, Raynaud's disease, cholesterol embolism, Wallenberg syndrome, Hippel-Lindau disease, and telangiectasis.

Inflammatory disease and autoimmune disease may be studied in other applications of the system or business method. Examples of inflammatory disease and autoimmune disease include, but are not limited to, rheumatoid arthritis, non-specific arthritis, inflammatory disease of the larynx, inflammatory bowel disorder, psoriasis, hypothyroidism (e.g., Hashimoto's thyroiditis), colitis, Type 1 diabetes, pelvic inflammatory disease, inflammatory disease of the central nervous system, temporal arteritis, polymyalgia rheumatica, ankylosing spondylitis, polyarteritis nodosa, Reiter's syndrome, scleroderma, and systemic lupus erythematosus.

Infectious disease may be studied in still further aspects of the system or business method. Examples of infectious disease include, but are not limited to, AIDS, hepatitis C, SARS, tuberculosis, sexually transmitted diseases, leprosy, Lyme disease, malaria, measles, meningitis, mononucleosis, whooping cough, yellow fever, tetanus, arboviral encephalitis, and other bacterial, viral, fungal or helminthic diseases.

Neurological diseases include dementia, Alzheimer's disease, Parkinson's disease, ALS, and MS.

Pregnancy related disorders include pre-eclampsia, eclampsia, pre-term birth, growth restriction in utero, rhesus incompatibility, retained placenta, septicemia, separation of the placenta, ectopic pregnancy, hyperemesis gravidarum, placenta previa, erythroblastosis fetalis, and pruritic urticarial papules and plaques.

Samples may be collected from a variety of sources in a given patient depending on the application of the business. In some embodiments samples are collected on the account of the company itself, while in other examples they are collected in collaboration with an academic collaborator or pharmaceutical collaborator that, for example, is collecting samples in a clinical trial. Samples collected are preferably bodily fluids such as blood, serum, sputum, saliva, plasma, nipple aspirants, synovial fluids, cerebrospinal fluids, sweat, urine, fecal matter, pancreatic fluid, trabecular fluid, tears, bronchial lavage, swabbings, bronchial aspirants, semen, precervicular fluid, vaginal fluids, pre-ejaculate, etc. In a preferred embodiment, a sample collected is approximately 1 to 5 ml of blood.

In some instances, samples may be collected from individuals over a longitudinal period of time (e.g., once a day, once a week, once a month, biannually or annually). The longitudinal period may, for example, also be before, during, and after a stress test or a drug treatment. Obtaining numerous samples from an individual over a period of time can be used to verify results from earlier detections and/or to identify an alteration in polypeptide pattern as a result of, for example, aging, drug treatment, pathology, etc. Samples can be obtained from humans or non-humans. In a preferred embodiment, samples are obtained from humans.

When obtaining a blood, serum, or plasma sample, a coagulation cascade may activate proteases that can induce clotting and cleave proteins in the sample. Preferably, such processes are prevented or their effect reduced. Thus, for serum samples, clots are preferably separated from the serum as soon as the clotting process is completed, and the serum is then frozen as quickly as possible, within no more than 24 hrs, 12 hrs, 6 hrs, 3 hrs, or 1 hr. Similarly, for plasma samples, the present invention contemplates removing cells quickly from the blood sample (e.g., in less than 24 hrs, 12 hrs, 6 hrs, 3 hrs, or 1 hr) and freezing the plasma as soon as possible. Preferred protocols for sample collection and storage are given in Table 1 below.

TABLE 1
Recommended protocols for blood collection and storage.

Process Step: Tube type
Serum: Plastic serum separator tube (Plus SST)
Plasma: K₂EDTA

Process Step: Clotting time and temp
Serum: 30-45 min at room temperature
Plasma: N/A

Process Step: Centrifuge
Serum: 10 min at 1100-1300 g at room temperature
Plasma: Within 30 min of venipuncture centrifuge for 15 min at 2500 g at room temperature

Process Step: Aliquot and Freezing
Serum: 0.5 mL aliquots to cryovials, and refrigerated until frozen at −80° C., within 2 hours of venipuncture.
Plasma: 0.5 mL aliquots to cryovials, and refrigerated until frozen at −80° C., within 2 hours of venipuncture.
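The serum column of Table 1 can be expressed as a simple handling check. The following is a sketch only; the record layout and function name are hypothetical, and a laboratory would normally enforce such limits in its own sample tracking system.

```python
from dataclasses import dataclass

@dataclass
class SerumHandling:
    clot_min: float            # clotting time at room temperature, minutes
    spin_min: float            # centrifugation time, minutes
    spin_g: float              # relative centrifugal force during the spin
    frozen_after_min: float    # minutes from venipuncture to freezing at -80 C

def serum_deviations(rec):
    """List deviations from the serum column of Table 1 (empty list = compliant)."""
    issues = []
    if not 30 <= rec.clot_min <= 45:
        issues.append("clotting time outside 30-45 min")
    if rec.spin_min != 10 or not 1100 <= rec.spin_g <= 1300:
        issues.append("centrifugation not 10 min at 1100-1300 g")
    if rec.frozen_after_min > 120:
        issues.append("frozen more than 2 hours after venipuncture")
    return issues

print(serum_deviations(SerumHandling(clot_min=40, spin_min=10, spin_g=1200, frozen_after_min=90)))
print(serum_deviations(SerumHandling(clot_min=60, spin_min=10, spin_g=1200, frozen_after_min=150)))
```

A plasma check would follow the same pattern using the plasma column (centrifugation within 30 min of venipuncture, 15 min at 2500 g).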

Sample Preparation

After samples are collected, they are optionally prepared and/or separated before they are analyzed. Sample preparation and separation can involve any of the following procedures, depending on the type of sample collected and/or types of marker or protein searched: removal of high abundance markers or polypeptides (e.g., albumin and transferrin); addition of preservatives and calibrants; denaturation; desalting of samples; concentration of sample markers and/or polypeptides; selective capture of specific molecules with affinity reagents; protein digestions; and fraction collection. Further disruption of proteolytic processes by adding protease inhibitors to blood collection tubes or tubes used to store or prepare the blood is also used in some embodiments. Examples of protease inhibitors that may be added to a blood, plasma or serum sample include but are not limited to acid protease inhibitors, serine protease inhibitors, threonine protease inhibitors, cysteine protease inhibitors, aspartic acid protease inhibitors, metallo protease inhibitors, and glutamic acid protease inhibitors. Examples of common serine protease inhibitors include alpha 1-antitrypsin, complement 1-inhibitor, antithrombin, alpha 1-antichymotrypsin, plasminogen activator inhibitor 1 (coagulation, fibrinolysis) and neuroserpin. In preferred embodiments, a protease inhibitor is an acid protease inhibitor, or more preferably, Pepstatin A. Other examples of acid protease inhibitors include Ahpatinins,

In some embodiments, sample preparation may involve denaturation or the addition of an added solution to the sample.

Exemplary steps for sample preparation are given in Table 2 below:

TABLE 2
Sample preparation procedure.
(i) Dilute 50 μL serum to 500 μL in 1% formic acid, 1 μM pepstatin, 300 nM angiotensin III, 1 μM aprotinin
(ii) Centrifuge through 50 kDa ultrafiltration membranes (30 min., 14,000 x g)
(iii) Apply to activated reverse phase resin in 96 well plate (Waters μElute plate) on a vacuum manifold
(iv) Wash (desalt) and then elute (70% ACN, 0.1% acetic acid). Dry under N2 stream
(v) Redissolve each well with 5 μL 20% IPA, 0.1% formic acid, 3 μM renin substrate, 3 μM bradykinin, using two minute vortexing
(vi) Freeze @ −20° C. until analysis

FIG. 3 illustrates the efficiency of the sample preparation method for removal of high MW components and recovery of low MW components. Total protein was measured on serum before preparation (70 mg/mL) and after preparation by denaturation using an acid followed by ultrafiltration (70 μg/mL); the denaturation released a significant amount of lower molecular weight components. In particular, FIG. 3 shows an SDS PAGE gel of serum with and without denaturation of serum with acid prior to ultrafiltration. Lane 1 of FIG. 3 illustrates protein from 0.025 μL of unprocessed serum. Lane 2 of FIG. 3 illustrates protein from 40 μL serum diluted 1:10 with water, passed through 30 kD MWCO membrane. Lane 3 of FIG. 3 illustrates 40 μL serum diluted 1:10 with water, passed through 50 kD MWCO membrane. Lane 4 of FIG. 3 illustrates 40 μL serum diluted 1:10 with 1% formic acid, passed through 30 kD MWCO membrane. Lane 5 of FIG. 3 illustrates 40 μL serum diluted 1:10 with 1% formic acid, passed through 50 kD MWCO membrane.

FIG. 3 demonstrates that about 99% of polypeptides were depleted by denaturation prior to separation by ultrafiltration. Recovery of representative polypeptides averaged 65%, demonstrating the efficiency of low MW peptide recovery.
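The depletion figure can be checked against the total protein values quoted for FIG. 3 (70 mg/mL before, 70 μg/mL after). The text does not state whether the post-preparation value refers to the 1:10 diluted filtrate or has already been corrected to the original serum volume, so the sketch below computes both readings; the dilution correction is an assumption.

```python
def depletion_fraction(before_mg_per_ml, after_mg_per_ml, dilution_factor=1.0):
    """Fraction of total protein removed.

    dilution_factor scales the 'after' value back to the original serum volume;
    1.0 means the value is taken as already serum-equivalent (assumption).
    """
    serum_equivalent = after_mg_per_ml * dilution_factor
    return 1.0 - serum_equivalent / before_mg_per_ml

# 70 mg/mL before and 70 ug/mL (0.070 mg/mL) after, as quoted for FIG. 3.
as_reported = depletion_fraction(70.0, 0.070)
corrected_1_to_10 = depletion_fraction(70.0, 0.070, dilution_factor=10)
print(f"no dilution correction: {as_reported:.1%}")
print(f"assuming a 1:10 diluted filtrate: {corrected_1_to_10:.1%}")
```

Either reading gives a depletion of roughly 99% or more, in line with the figure stated above.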

Additional examples on the use and effects of protease inhibitors on sample analysis are discussed herein.

Preferably, sample preparation techniques concentrate information-rich markers or polypeptides (e.g., polypeptides that have “leaked” from diseased cells or are produced by the host response to the tumor) and deplete markers and/or polypeptides that would carry little or no information, such as those that are highly abundant or native to serum (e.g., classical plasma proteins such as albumin). FIG. 9 illustrates the range of abundances of various components/markers in serum. Classical plasma proteins that are highly abundant are preferably removed from a sample prior to analysis.

Sample preparation can take place in a manifold or preparation/separation device. In a preferred embodiment, such preparation/separation device is a microfluidic device. Optimally, the preparation/separation device interfaces directly or indirectly with a detection device. In another embodiment, such preparation/separation device is a fluidics device. In yet another embodiment, the preparation device is a 96-well plate and the separation device is a microfluidic device.

In other preferred embodiments, sample preparation uses conventional methods (e.g., pipettes and 96 well plates), while separation takes place on a microfluidic device.

Approximately 100 μL of a sample or less is analyzed per assay in some particular embodiments of the invention. Removal of undesired markers or polypeptides (e.g., high abundance, uninformative, or undetectable polypeptides) can be achieved using, e.g., high affinity reagents, high molecular weight filters, size exclusion, ultracentrifugation and/or electrodialysis.

High Affinity Reagents

High affinity reagents include antibodies or aptamers that selectively bind to high abundance polypeptides or reagents that have a specific pH, ionic value, or detergent strength. Examples of high affinity reagents that can be used to remove high abundance or information-depleted components from a sample include antibodies and aptamers that selectively bind to such components (e.g., polypeptide, reagents, etc.). For example, albumin may be removed by specific antibodies (Pieper, R., et al. (2003) Proteomics 3, 422-32), dyes (e.g., Cibacron Blue), synthetic peptides, and aptamers. Immunoglobulins (e.g., IgG) can readily bind Protein A and Protein G. Other antibody reagents are also available for removal of abundant proteins (e.g., Agilent's High-Capacity Multiple Affinity Removal System). In preferred embodiments, a device that removes the highest abundance proteins, such as Agilent's device, is utilized to remove a high abundance protein.

High Molecular Weight Filters

High molecular weight filters include membranes that separate molecules on the basis of size and molecular weight. Such filters may further employ reverse osmosis, dialysis, nanofiltration, ultrafiltration and microfiltration.

Examples of high molecular weight filters that can be used to remove undesired components from a sample include membranes that separate molecules on the basis of size and molecular weight. Such membranes may further employ reverse osmosis, dialysis, nanofiltration, ultrafiltration and microfiltration. In some embodiments, high molecular weight filters separate out all components that have a molecular weight greater than 1,000 kD, 900 kD, 800 kD, 700 kD, 600 kD, 500 kD, 400 kD, 300 kD, 200 kD, 100 kD, 90 kD, 80 kD, 70 kD, 60 kD, 50 kD, 40 kD, 30 kD, 20 kD, 10 kD, or 1 kD.

Ultracentrifugation

Ultracentrifugation is another method for removing undesired components of a sample. Ultracentrifugation can involve centrifugation of a sample at at least about 10,000 rpm, 20,000 rpm, 30,000 rpm, 40,000 rpm, 50,000 rpm, 60,000 rpm, 70,000 rpm, 80,000 rpm, 90,000 rpm, or 100,000 rpm while monitoring with an optical system the sedimentation (or lack thereof) of particles.
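By way of illustration only, a rotor speed stated in rpm can be related to the relative centrifugal force (RCF) experienced by the sample. The following minimal sketch assumes a hypothetical rotor radius of 8 cm, which the present description does not specify; the function name and the radius are illustrative assumptions rather than part of the disclosed embodiments.

```python
import math

STANDARD_GRAVITY_M_S2 = 9.80665  # standard acceleration of gravity


def relative_centrifugal_force(rpm: float, rotor_radius_cm: float) -> float:
    """Return the RCF (in multiples of g) for a rotor speed and radius.

    rotor_radius_cm is an assumed value; the text gives rotor speeds only.
    """
    if rpm < 0 or rotor_radius_cm <= 0:
        raise ValueError("rpm must be >= 0 and rotor radius must be > 0")
    angular_velocity = 2.0 * math.pi * rpm / 60.0  # rad/s
    centripetal_accel = angular_velocity ** 2 * (rotor_radius_cm / 100.0)  # m/s^2
    return centripetal_accel / STANDARD_GRAVITY_M_S2


if __name__ == "__main__":
    # Hypothetical 8 cm rotor radius, evaluated at three of the listed speeds.
    for rpm in (10_000, 50_000, 100_000):
        rcf = relative_centrifugal_force(rpm, rotor_radius_cm=8.0)
        print(f"{rpm} rpm -> about {rcf:,.0f} x g")
```

Because RCF scales with the square of the rotor speed, a tenfold increase in rpm corresponds to a hundredfold increase in force for the same rotor.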

Electrodialysis

Another method for removing undesired components is via electrodialysis. Electrodialysis is an electromembrane process in which ions are transported through ion permeable membranes from one solution to another under the influence of a potential gradient. Since the membranes used in electrodialysis have the ability to selectively transport ions having positive or negative charge and reject ions of the opposite charge, electrodialysis is useful for concentration, removal, or separation of electrolytes.

In a preferred embodiment, the manifold or microfluidic device performs electrodialysis to remove high molecular weight markers and polypeptides or undesired markers and polypeptides. Electrodialysis is first used to allow only molecules under approximately 30 kD (not a sharp cutoff) to pass through into a second chamber. A second membrane with a very small molecular weight cutoff (roughly 500 D) allows smaller molecules such as salts to egress the second chamber.

In some embodiments, electrodialysis is used to allow only molecules under approximately 10 kDa, 20 kDa, 30 kDa, 40 kDa, 50 kDa, 60 kDa, 70 kDa, 80 kDa, 90 kDa, or 100 kDa to pass through from a first chamber into a second chamber. A second membrane with a very small molecular weight cutoff, e.g., less than 900 Da, 800 Da, 700 Da, 600 Da, 500 Da, 400 Da, 300 Da, 200 Da, or 100 Da, allows smaller molecules such as salts to egress the second chamber.
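The two-membrane arrangement described above can be viewed as a molecular weight window: species below the first cutoff enter the second chamber, and species below the second cutoff leave it again. The following sketch models that window with idealized sharp cutoffs, which is an assumption made for illustration only (the description notes that the cutoff is not sharp). The function name, the default cutoffs of 30 kDa and 500 Da, and the example masses are hypothetical.

```python
from typing import Iterable, List


def retained_in_second_chamber(
    masses_da: Iterable[float],
    upper_cutoff_da: float = 30_000.0,  # first membrane, taken from the ~30 kD example
    lower_cutoff_da: float = 500.0,     # second membrane, taken from the ~500 D example
) -> List[float]:
    """Return the masses that pass the first membrane but not the second.

    Idealized model: real membranes have gradual, not sharp, cutoffs.
    """
    if lower_cutoff_da >= upper_cutoff_da:
        raise ValueError("lower cutoff must be below upper cutoff")
    return [m for m in masses_da if lower_cutoff_da <= m < upper_cutoff_da]


if __name__ == "__main__":
    # Approximate masses in Da: a large serum protein, a small peptide
    # hormone, and a salt. The values are illustrative only.
    sample = [66_500.0, 5_808.0, 58.44]
    print(retained_in_second_chamber(sample))
```

In this idealized picture the large protein never enters the second chamber, the salt leaves through the second membrane, and only the intermediate species is retained for analysis.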

Size Exclusion

Another method for separating molecules by molecular weight is size exclusion chromatography, also called gel-permeation chromatography (GPC). Size exclusion chromatography uses porous particles to separate molecules of different sizes. In size exclusion chromatography, molecules can flow past a porous resin or be entrapped or entrained in a porous resin. Thus, molecules that are smaller than the pore size can enter the particles and therefore have a longer path and longer transit time than larger molecules that cannot enter the particles. The low molecular weight molecules are collected by passing additional solution over the resin particles.

In some of the embodiments herein, depletion of high abundance markers such as proteins occurs based on size. For example, in one embodiment polypeptides >1,000 kD, 900 kD, 800 kD, 700 kD, 600 kD, 500 kD, 400 kD, 300 kD, 200 kD, 100 kD, 90 kD, 80 kD, 70 kD, 60 kD, 50 kD, 40 kD, 30 kD, 20 kD, 10 kD, or 1 kD are removed. More preferably polypeptides >50 kD, 49 kD, 48 kD, 47 kD, 46 kD, 45 kD, 44 kD, 43 kD, 42 kD, 41 kD, 40 kD, 39 kD, 38 kD, 37 kD, 36 kD, 35 kD, 34 kD, 33 kD, 32 kD, 31 kD, 30 kD, 29 kD, 28 kD, 27 kD, 26 kD, 25 kD, 24 kD, 23 kD, 22 kD, 20 kD, 19 kD, 18 kD, 17 kD, 16 kD, 15 kD, 14 kD, 13 kD, 12 kD, 11 kD, 10 kD, 9 kD, 8 kD, 7 kD, 6 kD, 5 kD, 4 kD, 3 kD, 2 kD, or 1 kD are removed. Preferably greater than 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, or 99% of such proteins with the above molecular weight are removed. In other embodiments, depletion of high abundance markers occurs based on binding specificity (e.g., using antibodies).

In one example, sample preparation including denaturation of components (e.g., polypeptides) occurs prior to detection of the sample by a detection device. More preferably, denaturation of markers occurs prior to removal of one or more high abundance materials. By denaturing such markers prior to their removal, bound analytes of interest are released such that they can be meaningful in later analysis. Denaturation may involve any technique known in the art including, for example, the use of heat, high salt concentrations, acids, bases, chaotropic agents, organic solvents, detergents and/or reducing agents. See Liotta, Lance A., et al., Nature (Oct. 30, 2003), Volume 425, page 905; Tirumalai, Radhakrishna S., et al. “Characterization of the Low Molecular Weight Human Serum Proteome,” Molecular & Cellular Proteomics 2.10 (Aug. 13, 2003), pages 1096-1103.

In one embodiment, denaturation occurs prior to filtration with a high-molecular weight filter. This allows for the disassociation of low molecular weight components from large protein complexes. Following size separation, the filtrate (low MW composition) may be concentrated and desalted with a reverse phase resin in a solid phase extraction (SPE) format.

Sample Separation

After samples are prepared, markers including polypeptides of interest may be separated or fractionated. Separation or fractionation can take place in the same location (manifold or microfluidic device) as the preparation or in another location. In a preferred embodiment, separation occurs in the same microfluidic device where preparation occurs, but in a different location on the device. Samples can be removed from an initial manifold location to a microfluidic device using various means, including an electric field. In one embodiment, the samples are concentrated during their migration to the microfluidic device using reverse phase beads and an organic solvent elution such as 50% methanol. This elutes the molecules into a channel or a well on a separation device of a microfluidic device. In another embodiment, samples are concentrated by isotachophoresis, in which ions are concentrated at a boundary between a leading and a trailing electrolyte of higher and lower electrophoretic mobilities, respectively. In other embodiments, sample preparation or sample fractionation occurs using conventional methods (e.g., pipettes and 96-well plates) and samples are then transferred to a microfluidic device for separation.

Separation can involve any procedure known in the art, such as capillary electrophoresis (e.g., in capillary or on a chip/microfluidic device), or chromatography (e.g., in capillary, column or on a chip/microfluidic device).

(i) Electrophoresis

Electrophoresis separates ionic molecules such as polypeptides by differential migration patterns through an open capillary or open channel or a gel based on the size and ionic charge of the molecules in an electric field. Electrophoresis can be conducted in a gel, capillary or on a chip. Examples of capillaries used for electrophoresis include capillaries that interface with an electrospray tip.

Capillary Gel Electrophoresis (CGE) separates ionic molecules through a gel. Examples of gels used for electrophoresis include starch, acrylamide, agarose or combinations thereof. In a preferred embodiment, polyacrylamide gels are used. A gel can be modified by its cross-linking, addition of detergents, immobilization of enzymes or antibodies (affinity electrophoresis) or substrates (zymography) and pH gradient. Examples of capillaries used for electrophoresis include capillaries that interface with an electrospray.

Capillary electrophoresis (CE) is preferred for separating complex hydrophilic molecules and highly charged solutes. Advantages of CE include its use of small samples (sizes ranging from 0.001 to 10 μL), fast separation, easy reproducibility, and the ability to be coupled to a mass spectrometer. CE technology uses narrow bore fused-silica capillaries to separate a complex array of large and small molecules. High voltages are used to separate molecules based on differences in charge, size and hydrophobicity. Depending on the types of capillary and buffers used, CE can be further segmented into separation techniques such as capillary zone electrophoresis (CZE), capillary isoelectric focusing (CIEF) and capillary electrochromatography (CEC).

Capillary zone electrophoresis (CZE), also known as free-solution CE (FSCE), is the simplest form of CE. The separation mechanism of CZE is based on differences in the size and charge of the analytes. Fundamental to CZE are homogeneity of the buffer solution and constant field strength throughout the length of the capillary. The separation relies principally on the pH-controlled dissociation of acidic groups on the solute or the protonation of basic functions on the solute.
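Under the constant field strength noted above, the time for an analyte to reach the detection point follows from its apparent mobility. The following minimal sketch illustrates that relationship. It assumes a uniform field equal to the applied voltage divided by the total channel length, and an apparent mobility that already includes any electro-osmotic contribution. The function name and all numeric values are hypothetical and are not taken from the embodiments herein.

```python
def migration_time_s(
    apparent_mobility_cm2_per_vs: float,
    voltage_v: float,
    total_length_cm: float,
    length_to_detector_cm: float,
) -> float:
    """Return the migration time (s) of an analyte in a uniform field.

    velocity = mobility * field, where field = voltage / total length.
    """
    if total_length_cm <= 0 or length_to_detector_cm <= 0:
        raise ValueError("lengths must be positive")
    field_v_per_cm = voltage_v / total_length_cm
    velocity_cm_per_s = apparent_mobility_cm2_per_vs * field_v_per_cm
    if velocity_cm_per_s <= 0:
        raise ValueError("analyte does not migrate toward the detector")
    return length_to_detector_cm / velocity_cm_per_s


if __name__ == "__main__":
    # Two hypothetical analytes that differ only in apparent mobility.
    for mobility in (2.0e-4, 3.0e-4):
        t = migration_time_s(mobility, voltage_v=5_000.0,
                             total_length_cm=10.0, length_to_detector_cm=8.0)
        print(f"mobility {mobility:.1e} cm^2/(V s) -> {t:.1f} s")
```

Analytes with different mobilities at a given pH therefore arrive at different times, which is the basis of the separation.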

Capillary isoelectric focusing (CIEF) allows amphoteric molecules, such as polypeptides, to be separated by electrophoresis in a pH gradient generated between the cathode and anode. A solute migrates to a point where its net charge is zero. At this isoelectric point (the solute's pI), migration stops and the sample is focused into a tight zone. In CIEF, once a solute has focused at its pI, the zone is mobilized past the detector by either pressure or chemical means.

CEC is a hybrid technique between traditional liquid chromatography (HPLC) and CE. In essence, CE capillaries are packed with HPLC packing and a voltage is applied across the packed capillary, which generates an electro-osmotic flow (EOF). The EOF transports solutes along the capillary towards a detector. Both differential partitioning and electrophoretic migration of the solutes occur during their transportation towards the detector, which leads to CEC separations. It is therefore possible to obtain unique separation selectivities using CEC compared to both HPLC and CE. The beneficial flow profile of EOF reduces flow related band broadening, and separation efficiencies of several hundred thousand plates per meter are often obtained in CEC. CEC also makes it possible to use small-diameter packings and achieve very high efficiencies.

Alternatively, isotachophoresis (ITP) is a method of concentrating samples by electrophoretic separation using a discontinuous buffer. See Osbourn, D. M., et al., “On-line Preconcentration Methods for Capillary Electrophoresis” Electrophoresis 2000, 21, 2768-2779. In ITP, charged molecules are concentrated at a boundary between a leading and a trailing electrolyte of higher and lower electrophoretic mobility, respectively. The technique can be used in conjunction with capillary electrophoresis where a discontinuous electrolyte system is preferably employed at the site of sample injection into the capillary.

Moreover, transient isotachophoresis (tITP) is a variation of this technique commonly used in conjunction with capillary electrophoresis (CE). Foret, F., et al. describe two electrolyte arrangements for performing tITP. See Foret, F., et al., “Trace Analysis of Proteins by Capillary Zone Electrophoresis with On-Column Transient Isotachophoretic Preconcentration,” Electrophoresis 1993, 14, 417-428.

One configuration employs two reservoirs connected by a capillary. The capillary and one reservoir are filled with a leading electrolyte (LE), while the second reservoir is filled with terminating electrolyte (TE). The sample for analysis is first injected into the capillary filled with LE and the injection end of the capillary is inserted into the reservoir containing TE. Voltage is applied and those components of the sample which have mobilities intermediate to those of the LE and TE stack into sharp ITP zones and achieve a steady state concentration. The concentration of such zones is related to the concentration of the LE co-ion but not to the concentration of the TE. Once a steady state is reached, the reservoir containing TE is replaced with an LE containing reservoir. This causes a destacking of the sharp ITP zones, which allows individual species to move in a zone electrophoretic mode.

The other configuration discussed by Foret, F., et al. employs a similar approach but uses a single background electrolyte (BGE) in each reservoir. The mobility of the BGE co-ion is low such that it can serve as the terminating ion. The sample for analysis contains additional co-ions with high electrophoretic mobility such that they can serve as the leading zone during tITP migration. After sample is injected into the capillary and voltage is applied, the leading ions of higher mobility in the sample form an asymmetric leading and sharp rear boundary. Just behind the rear boundary, a conductivity discontinuity forms, which results in a non-uniform electric field, and thus stacking of the sample ions. As migration progresses, the leading zone broadens due to electromigration dispersion and the concentration of higher mobility salt decreases. The result is decreasing differences of the electric field along the migrating zones. At a certain concentration of the leading zone, the sample bands destack and move with independent velocities in a zone electrophoretic mode.

In preferred embodiments, the samples are separated using CE, more preferably CEC with sol-gels, or more preferably CZE. This separates the molecules based on their electrophoretic mobility at a given pH (or hydrophobicity in the case of CEC).

A separation channel in a separation microfluidic device of the present invention is preferably coated with a positive coating that reduces molecular interactions at the low pH used in the system, and produces an electro-osmotic flow of at least 10 nL/min, 20 nL/min, 30 nL/min, 40 nL/min, 50 nL/min, 60 nL/min, 70 nL/min, 80 nL/min, 90 nL/min, 100 nL/min, 110 nL/min, 120 nL/min, 130 nL/min, 140 nL/min, or 150 nL/min to feed the electrospray process. Preferably, the electro-osmotic flow is of at least 100 nL/min. The microfluidic devices can separate all serum components in under 12 minutes, with a separation efficiency of 100,000 theoretical plates.
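For context, a separation efficiency expressed in theoretical plates is conventionally estimated from a single peak using its migration time and its width at half height. The following sketch applies that standard half-height relationship, N = 8 ln 2 (t / w)^2. The peak values shown are hypothetical and are not measurements from the devices described herein.

```python
import math


def plate_count(migration_time_s: float, half_height_width_s: float) -> float:
    """Estimate theoretical plates from one peak (half-height method).

    N = 8 * ln(2) * (t / w_half)^2, approximately 5.545 * (t / w_half)^2.
    """
    if migration_time_s <= 0 or half_height_width_s <= 0:
        raise ValueError("time and width must be positive")
    return 8.0 * math.log(2.0) * (migration_time_s / half_height_width_s) ** 2


def plates_per_meter(plates: float, separation_length_cm: float) -> float:
    """Normalize a plate count to the separation length."""
    if separation_length_cm <= 0:
        raise ValueError("separation length must be positive")
    return plates / (separation_length_cm / 100.0)


if __name__ == "__main__":
    # Hypothetical peak: 600 s migration time, 4.5 s width at half height,
    # on an assumed 20 cm separation channel.
    n = plate_count(600.0, 4.5)
    print(f"about {n:,.0f} plates, {plates_per_meter(n, 20.0):,.0f} plates/m")
```

Narrower peaks at a given migration time give a higher plate count, so efficiency figures of this kind summarize how little band broadening occurs during the separation.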

(ii) Chromatography

Chromatography is another method for separating a subset of polypeptides. Chromatography is based on the differential adsorption and elution of certain polypeptides. Liquid chromatography (LC), for example, involves the use of a fluid carrier over a stationary phase. Conventional LC columns have an inner diameter of roughly 4.6 mm and a flow rate of roughly 1 mL/min. Micro-LC has an inner diameter of roughly 1.0 mm and a flow rate of roughly 40 μL/min. Capillary LC utilizes a capillary with an inner diameter of roughly 300 μm and a flow rate of approximately 5 μL/min. Nano-LC is available with an inner diameter of 10-300 μm or 50 μm-1 mm and flow rates of 10-200 nL/min. Nano-LC columns can vary in length (e.g., 5, 15, or 25 cm) and have typical packing of C18, 5 μm particle size. Nano-LC stationary phase may also be a monolithic material, such as a polymeric monolith or a sol-gel monolith. In a preferred embodiment, nano-LC is used. Nano-LC provides increased sensitivity due to lower dilution of chromatographic sample. The sensitivity of nano-LC as compared to HPLC can be as much as 3700 fold.
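The lower dilution noted above is often approximated by the square of the ratio of column inner diameters, since the peak volume scales with the column cross-section when the same amount of sample is injected. The following sketch applies that approximation. It assumes a hypothetical 75 μm nano-LC column (within the 10-300 μm range stated above) compared against the 4.6 mm conventional column, and the function name is illustrative.

```python
def concentration_gain(conventional_id_mm: float, nano_id_mm: float) -> float:
    """Approximate peak-concentration gain from reducing column diameter.

    Assumes the same injected amount and comparable column efficiency, so
    the gain is the ratio of column cross-sectional areas.
    """
    if conventional_id_mm <= 0 or nano_id_mm <= 0:
        raise ValueError("inner diameters must be positive")
    return (conventional_id_mm / nano_id_mm) ** 2


if __name__ == "__main__":
    # 4.6 mm conventional column versus an assumed 75 um (0.075 mm) column.
    gain = concentration_gain(4.6, 0.075)
    print(f"approximate gain: {gain:,.0f} fold")
```

Under these assumptions the estimated gain is on the order of several thousand fold, which is the same order of magnitude as the figure given above.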

Ionization

Once prepared and separated, the markers (e.g., polypeptides or small molecules) are automatically delivered to a detection device, which detects the markers (e.g., polypeptides or small molecules) in a sample. In a preferred embodiment, markers (e.g., polypeptides or small molecules) in solution are delivered to a detection device by electrospray ionization (ESI). ESI operates by infusing a liquid containing the sample of interest through a channel or needle, which is kept at a potential (typically 3.5 kV). The voltage on the needle causes the spray to be charged as it is nebulized. The resultant droplets evaporate at atmospheric pressure or in a region maintained at a vacuum as low as several torr, until the solvent is essentially completely stripped off, leaving a charged ion. The charged ions are then detected by a detection device such as a mass spectrometer.

In a more preferred embodiment, nanoelectrospray ionization is used. Nanospray ionization is a miniaturized version of ESI and provides low detection limits using extremely limited volumes of sample fluid.

Ions formed by electrospray ionization normally are singly or multiply charged ions of molecules, with charge coming from protons or alkali metal ions bound to the molecules. Ion excitation may be produced by collision of ions with background gas or an introduced collision gas, e.g., collision induced dissociation (CID). Alternatively, excitation may be from collision with other ions, a surface, interaction with photons, heat, electrons, or alpha particles. Through excitation of the sample in an electrospray, the information content of the process should be altered and/or enhanced. Such excitation may, for example, desolvate ions, dissociate non-covalently bound molecules from analyte ions, break up solvent clusters, fragment background ions to change their mass to charge ratio and move them to a ratio that may interfere less with the analysis, strip protons and other charge carriers such that multiply charged ions move to different regions of the spectrum, and fragment analyte ions to produce additional, more specific or sequence-related information.

In preferred embodiments of the invention, the selected excitation system may be turned “on” and “off” to obtain a set of spectra in both states. The information content of the two spectra is, in most cases, far greater than the information content of either single spectrum. In such embodiments, the system includes a switching device for activating and de-activating the excitation/ionization system. Analysis software which is part of the informatics tools herein may be configured to analyze the sample separately both in the “on” state of the excitation system and in the “off” state of the excitation system. Different markers may be detected more efficiently in one or the other of these two states.

In preferred embodiments, separated markers, including optionally polypeptides, are directed down a channel that leads to an electrospray ionization emitter, which is built into a microfluidic device (an integrated ESI microfluidic device). Preferably, such integrated ESI microfluidic device provides the detection device with samples at flow rates and complexity levels that are optimal for detection. Such flow rates are, preferably, approximately 1-1000 nL/min, 10-800 nL/min, 20-600 nL/min, 30-400 nL/min, 40-300 nL/min, or more preferably approximately 50-200 nL/min.

Furthermore, a microfluidic device is preferably aligned with a detection device for optimal sample capture. For example, using dynamic feedback circuitry, a microfluidic device may allow for control positioning of an electrospray voltage and for the entire spray to be captured by the detection device orifice. The microfluidic device can be sold separately or in combination with other reagents, software tools and/or devices.

In any of the embodiments herein, pressure may be added to move a sample through a separation device and maintain a stable flow into the detection device. Such pressure may be applied after at least partial preparation of the sample or complete preparation of the sample. Such pressure can be added using a buffered solution which increases/maintains the flow rate of the liquid-containing sample. Such buffer can form a “sheath” around the sample and help sample components migrate to the end of an electrophoretic separation capillary and into the detection device. Such sheath may also dilute the sample being detected.

In some embodiments, the invention contemplates methods for sheathless ionization. In one embodiment, a sheathless ionization element provides voltage from a second channel to produce enough energy to generate the electrospray. In another embodiment, an electrical contact at the spray tip provides the voltage to generate the electrospray.

FIG. 11 is an exemplary embodiment of a microfluidic device having a sheathless ionization element. The microfluidic device in FIG. 11 has a curved separation channel 1101, a second channel 1110 for application of the electrospray/electrophoresis voltage, and the electrospray emitter tip 1120. Sample is inputted in the well at sample input location 1103 and exits in the well at sample output location 1104, while separation buffer is inputted in the well at location 1102. The emitter tip 1120 is protected from mechanical damage by plastic extensions on either side. The microfluidic device is preferably made of a polymeric material, such as plastic, and is disposable. Thus it is contemplated by the present invention that an electrospray emitter is integrated with the preparation/separation microfluidic device which is also polymeric and disposable.

In preferred embodiments, the samples are separated using capillary electrophoresis separation, more preferably CEC with sol-gels, or more preferably CZE. This will separate the molecules based on their electrophoretic mobility at a given pH (or hydrophobicity in the case of CEC).

FIG. 13 shows the microfluidic device in an expanded view of the electrospray emitter tip. The side channel 1310 is uncoated so no electro-osmotic flow is generated. Positive analyte ions from the separation channel 1320 do not move into the side channel because their electrophoretic mobility is in the opposite direction. Thus, all of the analyte ions are sprayed from the tip 1330 without the dilution effect that is common to similar interfaces that use a sheath. Voltages for the separation and electrospray are provided either to liquids in wells or electrodes in the microfluidic device, which prevents bubble formation in the channels or at the tip due to electrolysis. The electrospray voltage at the tip is determined by the ratio of the electrical conductivities of the separation and side channels. The voltage provided by side channel 1310 may be, for example, less than 10 V, 5 V, 1 V, 0.5 V, 0.1 V, 0.05 V, 0.01 V, or between 0.0001-10 V, between 0.001-1 V, or between 0.01 and 0.1 V. No additional electrode or tip electrical coating, as found on other integrated electrospray tips for sheathless electrospray interfacing, is used. A voltage controller has been designed to provide the high voltages to each well on the chip, and to change them in proper sequence for sample loading, injection, and separation. Importantly, the voltages are floated with respect to a common, permitting the electrospray voltage to be changed without altering the potential differences between electrodes that drive the separation.
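The dependence of the tip voltage on the ratio of channel conductivities can be pictured with a simple two-branch voltage divider. The following sketch is an idealized illustration only: it assumes that the separation channel and the side channel meet at the tip, that each behaves as a single ohmic conductance, and that the current carried away by the spray itself is negligible. The function name, the well voltages, and the conductance values are hypothetical and are not taken from FIG. 13.

```python
def tip_voltage(
    separation_well_v: float,
    side_well_v: float,
    separation_conductance: float,
    side_conductance: float,
) -> float:
    """Return the junction (tip) potential for two channels meeting at the tip.

    Conductance-weighted average of the two well potentials, from current
    balance at the junction with the spray current neglected (assumption).
    Only the ratio of the two conductances matters.
    """
    if separation_conductance <= 0 or side_conductance <= 0:
        raise ValueError("conductances must be positive")
    total = separation_conductance + side_conductance
    weighted = (separation_well_v * separation_conductance
                + side_well_v * side_conductance)
    return weighted / total


def float_all(voltages: dict, offset_v: float) -> dict:
    """Shift every well voltage by the same offset (floating the supply)."""
    return {name: v + offset_v for name, v in voltages.items()}


if __name__ == "__main__":
    wells = {"separation": 10_000.0, "side": 2_000.0}  # hypothetical values
    print(tip_voltage(wells["separation"], wells["side"], 1.0, 3.0))
    shifted = float_all(wells, 500.0)
    print(tip_voltage(shifted["separation"], shifted["side"], 1.0, 3.0))
    print(shifted["separation"] - shifted["side"])  # driving difference unchanged
```

In this simplified picture, shifting all well voltages by a common offset changes the tip potential by that offset while leaving the potential difference that drives the separation unchanged, consistent with the floated-voltage arrangement described above.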

In either a sheath or sheathless system, buffers may be used to improve signal intensity and/or carry the voltage charge. Examples of buffers that can be used in a sheath or sheathless system include, but are not limited to, 10-50% methanol, 10-50% ethanol, 10-50% n-propanol, 10-50% isopropanol, each including 10-100 nM acetic acid or formic acid.

The selected buffer system can be fully volatile, and moreover, in-line transient isotachophoresis can be employed to further improve signal intensity.

In one embodiment, the present invention relates to a sheathless-ESI interface that couples a capillary electrophoresis (CE) microfluidics device to a time-of-flight (TOF) mass spectrometer for the automated separation and detection of intact polypeptides in human serum. The sheathless interface provided in this embodiment of the invention is often preferred for its relatively improved inherent sensitivity. To further increase sensitivity, it may be preferable under particular conditions to employ transient isotachophoresis (tITP) to concentrate a sample on-line.

In some embodiments, pressure is added using a combination of sheath and sheathless processes.

Calibrants can also be sprayed into a detection device. Calibrants are used to set instrument parameters and for signal processing calibration purposes. Calibrants are preferably utilized before a real sample is assessed or at the same time a real sample is assessed. Calibrants can interface with a detection device using the same or a separate interface as the samples. In a preferred embodiment, calibrants are sprayed into a detection device using a second interface (e.g., second spray tip).

Microfluidic Devices

In some of the embodiments herein, sample preparation and/or separation occur on a microfluidic device. In other preferred embodiments, the steps of sample preparation and separation are combined using microfluidics technology. A microfluidic device is a device that can transport liquids including various reagents such as analytes and elutions between different locations using microchannel structures. Microfluidic devices provide advantageous miniaturization, automation and integration of a large number of different types of analytical operations. For example, continuous flow microfluidic devices have been developed that perform serial assays on extremely large numbers of different chemical compounds. Microfluidic devices may also provide the feature of disposability, to prevent sample carry-over. By microfluidic device it is intended to mean herein devices with channels smaller than 1000 μm, preferably less than 500 μm, and more preferably less than 100 μm. Preferably such devices use sample volumes of less than 1000 μl, preferably less than 500 μl, and most preferably less than 100 μl.

Preferably, both sample preparation and separation occur on microfluidic device(s). More preferably, both sample preparation and sample separation occur on the same microfluidic device. Optimally, any of the above, or more preferably a single preparation/separation microfluidic device interfaces directly or indirectly with a detection device. Preferably, the microfluidic devices are disposable, meaning that they are marketed for one or a few uses followed by disposal and replacement. Preferably, sample preparation occurs using conventional methods, while separation occurs on a microfluidic device.

The microfluidic devices herein are preferably polymeric and/or disposable. A microfluidic device (or chip) may be formed in any material known in the art. In some embodiments, a microfluidic device herein is formed from a polymer such as plastic by means of, for example, etching, machining, cutting, molding, casting or embossing. In some embodiments, the microfluidic devices can be made from glass or silicon by means of, for example, etching, machining, embossing, or cutting. In some embodiments, the microfluidic devices may be formed by polymerization on a form or other mold. Preferably, the microfluidic devices may be fabricated by hot embossing of PMMA and the channels are sealed by lamination with a 75 μm PMMA film.

A positively-charged coating can then be applied to the separation channel after lamination. A microfluidic device can provide multiple integrated operations as well as fast separations, efficient electrospray ionization, high throughput, zero carry-over between samples, and reliable, reproducible, connection-free fluid junctions. The particular operations performed by the microfluidic devices herein depend, in part, upon the detection technology that is utilized.

A mass spectrometer of the present invention preferably contains one or more disposable inlet capillaries for receiving spray from a microfluidic device. Inlet capillaries can be made with high precision, and mating of hardware to the mass spectrometer can be performed by a person of ordinary skill in the art. A capillary within a mass spectrometer herein is preferably designed to include a faceplate to avoid the need to clean the outside face of the MS inlet. Furthermore, the inlet capillary could be connected directly or indirectly to the electrospray emitter. Preferably, the orientation and/or proximity of the emitter tip to the inlet capillary is pre-determined and does not need to be set or adjusted by the user. One benefit of the capillary inlet is that it allows an operator to simply replace the mass spectrometer's inlet capillary assembly as opposed to having to dismantle and clean the entire source of the mass spectrometer.

A microfluidic device can transport liquids including various reagents such as analytes and elutions between different locations using microchannel structures. Microfluidic devices provide advantageous miniaturization, automation and integration of a large number of different types of analytical operations. For example, continuous flow microfluidic devices have been developed that perform serial assays on extremely large numbers of different chemical compounds. Microfluidic devices may also provide the feature of disposability, to prevent sample carry-over.

By microfluidics device it is intended to mean devices with channels having a channel width smaller than 1000 μm, 900 μm, 800 μm, 700 μm, 600 μm, 500 μm, 400 μm, 300 μm, 200 μm, 100 μm, 50 μm or 10 μm and a channel height of the same or similar dimension. In some embodiments, such devices perform functions on a sample having volume less than 1000 nL, 900 nL, 800 nL, 700 nL, 600 nL, 500 nL, 400 nL, 300 nL, 200 nL, 100 nL, 50 nL, 10 nL, 5.0 nL, 1.0 nL, 0.5 nL, 0.1 nL or less.

The microfluidic devices may be either single use for a single sample; multi-use for a single sample at a time with serial loading; single use with parallel multiple sample processing; multi-use with parallel multiple sample processing; or a combination. Furthermore, more than one microfluidic device may be integrated into the system and interface with a single detection device. In preferred embodiments, the microfluidic device is a disposable device that is readily connected to and removed from the mass spectrometer, and sold as a disposable, thereby providing a recurring revenue stream to the involved business and a reliable product to the consumer. Preferably, the disposable product is for single use only. In some embodiments, the disposable microfluidic device is for multiple uses. Preferably, a mass spectrometer that accepts a continuous sample stream for analysis and provides high sensitivity throughout the detection process is utilized. Preferably, any reagents used for preparation/separation are provided in or along with the microfluidic device, thereby allowing for additional recurring revenue to the business herein and higher performance for the user. In some of the embodiments herein, the microfluidic device(s) have a sheathless ionization interface.

It is further contemplated that after detection of a marker, the business herein may further develop diagnostic products based on such marker. A diagnostic product for a polypeptide marker can include, for example, an antibody (polyclonal, monoclonal, humanized, or a fragment thereof) or other agent that can detect the presence/absence or level of a marker in a sample.

The business methods herein also contemplate providing diagnostic services to, for example, health care providers, insurers, patients, etc. The business herein can provide diagnostic services by either contracting out with a service lab or setting up a service lab (under Clinical Laboratory Improvement Amendment (CLIA) or other regulatory approval). Such service lab can then carry out the methods disclosed herein to identify if a particular pattern and/or marker is within a sample.

Once prepared and separated, the polypeptides are automatically delivered to a detection device, which detects the polypeptides in a sample. In a preferred embodiment, polypeptides in elutions or solutions are delivered to a detection device by electrospray ionization (ESI). ESI operates by infusing a liquid containing the sample of interest through a channel or needle, which is kept at a potential (typically 3.5 kV). The voltage on the needle causes the spray to be charged as it is nebulized. The resultant droplets evaporate in a region maintained at a vacuum of several torr, until the solvent is essentially completely stripped off, leaving a charged ion. The charged ions are then detected by a detection device such as a mass spectrometer. In a more preferred embodiment, nanospray ionization (NSI) is used. Nanospray ionization is a miniaturized version of ESI and provides low detection limits using extremely limited volumes of sample fluid.

In preferred embodiments, separated polypeptides are directed down a channel that leads to an electrospray ionization emitter, which is built into a microfluidic device (an integrated ESI microfluidic device). Preferably, such integrated ESI microfluidic device provides the detection device with samples at flow rates and complexity levels that are optimal for detection. Such flow rates are, preferably, approximately 50-200 nL/min. Furthermore, a microfluidic device is preferably aligned with a detection device for optimal sample capture. For example, using dynamic feedback circuitry, a microfluidic device may allow for control of its positioning and of the electrospray voltage, and for the entire spray to be captured by the detection device orifice. The microfluidic device can be sold separately or in combination with other reagents, software tools and/or devices.

Calibrants can also be sprayed into a detection device. Calibrants are used to set instrument parameters and for signal processing calibration purposes. Calibrants are preferably utilized before a real sample is assessed. Calibrants can interface with a detection device using the same or a separate interface as the samples. In a preferred embodiment, calibrants are sprayed into a detection device using a second interface (e.g., second spray tip).

Detection

Detection devices can comprise any device or use any technique that is able to detect the presence and/or level of a composition in a sample. Examples of detection techniques that can be used in a detection device include, but are not limited to, nuclear magnetic resonance (NMR) spectroscopy, 2-D PAGE technology, Western blot technology, immunoanalysis technology, electrochemical detectors, spectroscopic detectors, luminescent detectors, and mass spectrometry.

In a preferred embodiment, the system or business model herein relies on mass spectrometry to detect biomarkers, such as polypeptides, present in a given sample. There are various forms of mass spectrometers that may be utilized.

In a preferred embodiment, an ESI-MS detection device is utilized. An ESI-MS combines the novelty of ESI with mass spectrometry. Furthermore, an ESI-MS preferably utilizes a time-of-flight (TOF) mass spectrometry system. In TOF-MS, ions are generated by whatever ionization method is being employed and a voltage potential is applied. The potential extracts the ions from their source and accelerates them towards a detector. By measuring the time it takes the ions to travel a fixed distance, the mass-to-charge ratio of the ions can be calculated. TOF-MS can be set up to have an orthogonal-acceleration (OA). OA-TOF-MS instruments are advantageous and preferred over conventional on-axis TOF because they have better spectral resolution and duty cycle. OA-TOF-MS also has the ability to obtain spectra at a relatively high speed. See Brock et al., Anal. Chem. (1998) 70, 3735-41, which discusses an on-axis TOF known as Hadamard TOF-MS. In addition to the MS systems disclosed above, other forms of ESI-MS include quadrupole mass spectrometry, ion trap mass spectrometry, orbitrap mass spectrometry, and Fourier transform ion cyclotron resonance (FTICR-MS).
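By way of illustration only, the relationship between flight time and mass-to-charge ratio described above can be sketched as follows. The sketch assumes an idealized linear TOF with a single acceleration stage and a field-free drift region, so that z·e·V = ½·m·v² and v = L/t; the function names and example values are hypothetical and are not taken from the instruments described herein.

```python
import math

ELEMENTARY_CHARGE_C = 1.602176634e-19   # coulombs
DALTON_KG = 1.66053906660e-27           # kilograms per dalton


def tof_mass_to_charge(flight_time_s, path_length_m, accel_voltage_v):
    """Return m/z (Da per elementary charge) for an idealized linear TOF.

    Assumption: all of the ion's kinetic energy comes from one acceleration
    through accel_voltage_v, followed by field-free drift over path_length_m.
    """
    # z*e*V = 0.5*m*(L/t)^2  =>  m/z = 2*e*V*t^2 / L^2
    m_over_z_kg = (2.0 * ELEMENTARY_CHARGE_C * accel_voltage_v
                   * flight_time_s ** 2 / path_length_m ** 2)
    return m_over_z_kg / DALTON_KG


def tof_flight_time(m_over_z_da, path_length_m, accel_voltage_v):
    """Inverse of tof_mass_to_charge: drift time in seconds for a given m/z."""
    return path_length_m * math.sqrt(
        m_over_z_da * DALTON_KG / (2.0 * ELEMENTARY_CHARGE_C * accel_voltage_v))


# Hypothetical example: a 1 m drift path and 20 kV acceleration.
t = tof_flight_time(1000.0, 1.0, 20000.0)
recovered = tof_mass_to_charge(t, 1.0, 20000.0)
```

Under these assumptions, flight time scales with the square root of m/z, so heavier ions of the same charge arrive later at the detector.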

A quadrupole mass spectrometer consists of four parallel metal rods arranged in four quadrants (one rod in each quadrant). Two opposite rods have a positive applied potential and the other two rods have a negative potential. The applied voltages affect the trajectory of the ions traveling down the flight path. Only ions of a certain mass-to-charge ratio pass through the quadrupole filter and all other ions are thrown out of their original path. A mass spectrum is obtained by monitoring the ions passing through the quadrupole filter as the voltages on the rods are varied.

Ion trap mass spectrometry uses three electrodes to trap ions in a small volume. The mass analyzer consists of a ring electrode separating two hemispherical electrodes. A mass spectrum is obtained by changing the electrode voltages to eject the ions from the trap. The advantages of the ion-trap mass spectrometer include compact size and the ability to trap and accumulate ions to increase the signal-to-noise ratio of a measurement.

Orbitrap mass spectrometry uses spatially defined electrodes with DC fields to trap ions. Ions are constrained by the DC field and undergo harmonic oscillation. The mass is determined based on the axial frequency of the ion in the trap.

FTICR mass spectrometry is a mass spectrometric technique that is based upon an ion's motion in a magnetic field. Once an ion is formed, it eventually finds itself in the cell of the instrument, which is situated in a homogeneous region of a large magnet. The ions are constrained in the XY plane by the magnetic field and undergo a circular orbit. The mass of the ion can now be determined based on the cyclotron frequency of the ion in the cell.
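As a non-limiting sketch, the dependence of cyclotron frequency on mass-to-charge ratio may be expressed using the standard unperturbed cyclotron relation f = z·e·B/(2π·m). The function names and the 7 tesla field below are illustrative assumptions; trapping-field and space-charge effects present in a real cell are ignored.

```python
import math

ELEMENTARY_CHARGE_C = 1.602176634e-19   # coulombs
DALTON_KG = 1.66053906660e-27           # kilograms per dalton


def cyclotron_frequency_hz(m_over_z_da, field_tesla):
    """Unperturbed cyclotron frequency for an ion of the given m/z."""
    return (ELEMENTARY_CHARGE_C * field_tesla
            / (2.0 * math.pi * m_over_z_da * DALTON_KG))


def m_over_z_from_frequency(frequency_hz, field_tesla):
    """Invert the relation: recover m/z (Da per charge) from a measured frequency."""
    return (ELEMENTARY_CHARGE_C * field_tesla
            / (2.0 * math.pi * frequency_hz * DALTON_KG))


# Hypothetical example: an ion of m/z 1000 in an assumed 7 T magnet.
f = cyclotron_frequency_hz(1000.0, 7.0)
```

Because the frequency is inversely proportional to m/z, a precisely measured frequency yields a precise mass determination under these idealized assumptions.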

In a preferred embodiment, the system or business model herein employs a TOF mass spectrometer, or more preferably, an ESI-TOF-MS, or more preferably an OA-TOF-MS, or more preferably a mass spectrometer having a dual ion funnel and that supports dynamic switching between multiple quadrupoles in series, the second of which can be used to dynamically filter ions by mass in real time. In preferred embodiments, the detection device yields spectra at a rate of more than 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 seconds per spectrum. In preferred embodiments, the detection device yields at least 150, more preferably 200, or more preferably 300 spectra per second.

The detection device preferably interfaces with a separation/preparation device or microfluidic device, which allows for quick assaying of many of the polypeptides in a sample, or more preferably, most or all of the polypeptides in a sample. Preferably, a mass spectrometer is utilized that accepts a continuous sample stream for analysis and provides high sensitivity throughout the detection process (e.g., an ESI-MS). In another preferred embodiment, a mass spectrometer interfaces with one or more electrosprays, two or more electrosprays, three or more electrosprays or four or more electrosprays. Such electrosprays can originate from a single or multiple microfluidic devices.

In some preferred embodiments, the system herein employs a TOF mass spectrometer, or more preferably, an ESI-TOF-MS, or more preferably an ESI-OA-TOF-MS. In preferred embodiments, a mass spectrometer may have a single or dual ion funnel(s) and may support dynamic switching between multiple quadrupoles in series, the second of which can be used to dynamically filter ions by mass in real time. Such MS detection devices are described in more detail in Belov, M. E., et al. (2000) J Am Soc Mass Spectrom 11, 19-23 and Belov, M. E., et al. (2000) Anal Chem 72, 2271-9.

FIG. 14 illustrates an exemplary embodiment of a detection device of the present invention.

In some embodiments, an injection volume of the microfluidic device is less than 10 nL, 9 nL, 8 nL, 7 nL, 6 nL, 5 nL, 4 nL, 3 nL, 2 nL, 1 nL, 0.9 nL, 0.8 nL, 0.7 nL, 0.6 nL, 0.5 nL, 0.4 nL, 0.3 nL, 0.2 nL, or 0.1 nL. In some embodiments, less than 500 μL, 400 μL, 300 μL, 200 μL, 100 μL, 90 μL, 80 μL, 70 μL, 60 μL, 50 μL, 30 μL, 20 μL, 10 μL, 9 μL, 7 μL, 5 μL, 4 μL, 3 μL, 2 μL, or 1 μL of a sample is analyzed per assay.

The instrument has features for ion accumulation, ion selection, and scan overlapping that are being developed to improve sensitivity and capability further, and it can be configured for tandem mass spectrometry.

The detection system utilized preferably allows for the capture and measurement of most or all of the components (e.g., markers and polypeptides) that are introduced into the detection device. It is preferable that one can observe components (e.g., markers and polypeptides) with high information-content that are only present at low concentrations. By contrast, it is preferable to remove those in advance that are, for example, common to all cells, especially those in high abundance.

The detection devices herein can be used singly or in combination with one another.

Informatics

The output from a detection device can then be processed, stored, and further analyzed or assayed using a bio-informatics system. A bio-informatics system can include one or more of the following: a computer; a plurality of computers connected to a network; a signal processing tool(s); a pattern recognition tool(s); and optionally a tool(s) to control flow rate for sample preparation, separation, and detection.

Quality Assurance

Quality assurance methods are used to ensure that devices and/or instrumentations herein function properly and that outliers are discovered before discriminatory patterns are sought. Generally, quality assurance uses metrics including, but not limited to, total intensity of a spectrum, intensity of calibrants, intensity of expected peaks, resolution of calibrants, resolution of expected peaks, mass accuracy of calibrants, mass accuracy of expected peaks, ratios of intensities of peaks or other metrics alone or in combinations to eliminate data that should not be further analyzed due to issues such as, but not limited to, data acquisition problems or sample collection problems.

Signal Processing

Data/signal processing utilizes mathematical foundations. Generally, dynamic programming or non-linear fitting is preferably used to align a separation axis with a standard separation profile. Furthermore, intensities may be normalized, preferably by dividing by the total ion current of a spectrum or by dividing by the intensity of a calibrant, or using quantile normalization methods or by fitting roughly 90% of the intensity values into a standard spectrum. The data sets are then fitted using wavelets or other methods that are specifically designed for separation and mass spectrometer data. Data processing preferably filters out some of the noise and reduces spectrum dimensionality. This allows the system or business to identify the more highly predictive patterns.

Data/signal processing may involve the use of mathematical algorithms. Such signal processing can combine statistical and machine learning approaches to isolate the information-rich data features (e.g., forward and backward selection or ranking by univariate statistics, combined with Support Vector Machines and Kernel Discriminant Analysis), thereby reducing the dimensionality of the data and determining the combinations of these features that are highly predictive of a biological state or condition of interest. Rigorous cross-validation, false discovery rate analysis, and the use of independent validation sets remove issues with overfitting of data and bias in the study and allow finding more highly predictive and robust patterns that are more generalizable (i.e., patterns that are useful for analyzing other sample sets).

In some embodiments, data/signal processing may also involve the calibration of a mass-axis using linear correction determined by the calibrants. Calibration can take place prior to any sample detection; after sample detection; or in recurring intervals, for example.
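For illustration, a linear mass-axis correction of the kind described above might be sketched as an ordinary least-squares fit of the known calibrant masses against their observed positions. The least-squares fit, the function names, and the calibrant values are assumptions made for this sketch; the specification does not state how the linear correction is computed.

```python
def fit_linear_calibration(observed_mz, reference_mz):
    """Fit reference = slope * observed + intercept by ordinary least squares.

    observed_mz: calibrant peak positions as measured (hypothetical values).
    reference_mz: the known m/z values of the same calibrants.
    """
    n = len(observed_mz)
    mean_x = sum(observed_mz) / n
    mean_y = sum(reference_mz) / n
    sxx = sum((x - mean_x) ** 2 for x in observed_mz)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(observed_mz, reference_mz))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def apply_calibration(mz_values, slope, intercept):
    """Apply the fitted linear correction to every point on the mass axis."""
    return [slope * mz + intercept for mz in mz_values]


# Hypothetical calibrants whose observed positions drifted by a gain and offset.
reference = [500.0, 1000.0, 1500.0, 2000.0]
observed = [1.0001 * m + 0.01 for m in reference]
slope, intercept = fit_linear_calibration(observed, reference)
corrected = apply_calibration(observed, slope, intercept)
```

The same fitted slope and intercept would then be applied to the mass axis of the sample spectra acquired under the same conditions.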

A signal processing device herein can process data consisting of at least 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 5000, or 10,000 spectra, or at least 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 5000, or 10,000 spectra/hour.

Thus, in any of the embodiments herein, data/signal processing can involve one or more of the following steps: (i) correcting for any lack of experimental reproducibility, (ii) noise reduction/removal, and (iii) dimensionality reduction.

(i) Correcting for Lack of Experimental Reproducibility

Artifacts can be corrected using intensity normalization, transformation, and separation time alignment. Under this method, the intensity at each point in a spectrum is divided by the Total Ion Current (TIC) or by the intensity of a calibrant, or is adjusted by quantile normalization. This puts intensity on a common scale and allows comparisons across spectra. Additionally, each intensity value can be replaced by its square root (or log) to stabilize variances. Dynamic programming or non-linear fitting can be used to correct for any local or global contractions or dilations in the time in which components elute off the separations channel or column. A global alignment across all samples or an alignment to a standard spectrum can also be performed. These approaches increase the precision of data and allow the comparison of a spectrum with the correct corresponding spectrum in a different data set, even if the separations in the two experiments were different.
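The intensity normalization and variance-stabilizing transformation described above can be sketched as follows. This is a minimal illustration with hypothetical function names; it covers only the TIC and square-root options and does not implement quantile normalization or separation time alignment.

```python
import math


def normalize_by_tic(intensities):
    """Divide each intensity by the total ion current (sum of all intensities)."""
    tic = sum(intensities)
    if tic <= 0:
        raise ValueError("total ion current must be positive")
    return [value / tic for value in intensities]


def sqrt_transform(intensities):
    """Replace each intensity by its square root to stabilize variances."""
    return [math.sqrt(value) for value in intensities]


# Two hypothetical spectra of the same sample acquired at different overall gain.
spectrum_a = [1.0, 2.0, 3.0, 4.0]
spectrum_b = [10.0, 20.0, 30.0, 40.0]
norm_a = normalize_by_tic(spectrum_a)
norm_b = normalize_by_tic(spectrum_b)   # same profile as norm_a
stabilized = sqrt_transform(norm_a)
```

After normalization the two spectra have the same profile, which is what permits comparison across acquisitions.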

(ii) Noise Reduction/Removal

Standard denoising methods, such as Savitzky-Golay, as well as other methods using wavelet and Fourier transforms can be used to reduce experimental artifacts. Such methods remove high frequency noise in a spectrum without altering the generally lower frequency signal.
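As one hedged example of such a denoising step, the sketch below applies a Savitzky-Golay filter with a 5-point window and a quadratic fit, for which the convolution coefficients are the well-known (-3, 12, 17, 12, -3)/35. The window size, the polynomial order, and the choice to leave the two points at each edge unsmoothed are assumptions made for brevity.

```python
# Savitzky-Golay smoothing coefficients for a 5-point window, quadratic fit.
SG_COEFFS = (-3.0, 12.0, 17.0, 12.0, -3.0)
SG_NORM = 35.0


def savgol5(values):
    """Smooth a list of intensities with a 5-point quadratic Savitzky-Golay filter.

    The first and last two points are copied unchanged (an assumption of this
    sketch; production code would usually fit the edges separately).
    """
    smoothed = list(values)
    for i in range(2, len(values) - 2):
        window = values[i - 2:i + 3]
        smoothed[i] = sum(c * v for c, v in zip(SG_COEFFS, window)) / SG_NORM
    return smoothed


# A smooth quadratic signal passes through unchanged, while an isolated
# one-point spike (high-frequency noise) is attenuated.
quadratic = [float(x * x) for x in range(7)]
spike = [0.0, 0.0, 10.0, 0.0, 0.0]
smoothed_quadratic = savgol5(quadratic)
smoothed_spike = savgol5(spike)
```

The filter reproduces any locally quadratic signal exactly, which is why it tends to preserve peak shapes better than a plain moving average of the same width.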

(iii) Dimensionality Reduction

Experimental artifacts can be reduced by reduction of dimensionality. Dimensionality reduction is used to reduce the number of dimensions to the order of thousands (~1000s) and greatly reduce the risk of classifying based on noise. The reduction in the number of data features gives greater statistical assurance that patterns analyzed are predictive and generalizable. Methods used for dimensionality reduction include, for example, simple models that throw out data points with high P-values in a univariate statistical test and more complex models that use Support Vector Machines (SVMs) in an iterative manner.

Any of the signal processing tools above may include or be coupled to other software elements as well. For example, the signal processing system may provide for an easy to use user interface on the associated computer system and/or a patient database for integration of results into an institution's laboratory or patient information database system.

Pattern Recognition

Following data processing, pattern recognition tools are utilized to identify differences between biological or phenotypic states or conditions that may affect an organism. Pattern recognition tools are based on a combination of statistical and computer scientific approaches, which provide dimensionality reduction. Such tools are scalable.

Pattern recognition methods take as input the normalized, aligned, de-noised and dimensionally reduced data sets and find patterns that classify the patients into classes (for example, case versus control). The present invention contemplates any pattern recognition method known in the art, but preferably one or more of the following: Support Vector Machines, Discriminant Analysis, k-Nearest Neighbor, and Nearest Shrunken Centroid. Additional pattern recognition algorithms are also contemplated by the methods herein.

Pattern recognition methods can be used to find, for example, sets of data points (e.g., m/z values) that distinguish samples (e.g., cases from controls). Preferably, a three-fold cross validation is used to discover and test patterns found using the above techniques. Three-fold cross validation means that the dataset is divided into thirds, where one third is set aside as a test set and the other two thirds are used as a training set. This is performed three times, using a different third of the data as the test set each time. The training data is used to select features and find patterns that distinguish between the two groups (e.g., breast cancer and healthy). The test set is then used to assess how well the patterns perform on independent and blind data. Such cross validation methodology is very important in supervised learning, since it ensures that the predictive power of the pattern is assessed using a test set and thus is not biased. If such methods are not used, it is possible that data may be overfit and patterns discovered may not be generalizable (i.e., may not translate to new independent data and new populations). Thus, patterns discovered using the methods herein can be converted into simple decision algorithms in a diagnostic setting.
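The three-fold partitioning described above can be sketched as follows. The function name, the shuffling step, and the seed are illustrative assumptions; the essential property is that each sample serves in the test set exactly once and that feature selection and pattern finding use only the training indices of each split.

```python
import random


def three_fold_splits(n_samples, seed=0):
    """Return three (train_indices, test_indices) pairs covering all samples.

    Samples are shuffled once (with a fixed seed, for reproducibility) and
    dealt into three folds; each fold is used as the test set exactly once.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[k::3] for k in range(3)]
    splits = []
    for k in range(3):
        test = sorted(folds[k])
        train = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
        splits.append((train, test))
    return splits


# Hypothetical usage with 9 samples: select features and fit a pattern on
# `train` only, then assess it on the held-out `test` samples.
for train, test in three_fold_splits(9):
    assert not set(train) & set(test)   # the test third is never seen in training
```

In practice the split would usually also be stratified so that each third contains a similar proportion of cases and controls; that refinement is omitted here.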

In some embodiments, pattern recognition methods utilize hierarchical clustering, which is an unsupervised pattern recognition method. This method does not use information on the biological state of interest, but rather tries to organize the data into clusters based only on information found in the data. Such a method is especially useful for identifying sub-groupings within the data. For example, there may be subgroups of breast cancer that are due to known factors (e.g., Her2/neu overexpression) or due to unknown factors that have biological significance and could be the basis for further research. Such classifications may be important for understanding prognosis.

Data are analyzed in several ways. First, univariate statistics are used to find single data points that correlate with the presence/absence of a biological state or condition of interest. Such methods can be used either with or without prior signal processing. Standard non-parametric methods, such as non-parametric versions of the t-test (Mann-Whitney test) corrected for multiple comparisons by, for example, a Bonferroni correction, are used to analyze the data. After ranking by P-value, the data are visualized, and data points with low P-values and high group-mean differences are reported.
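A minimal sketch of the univariate analysis described above follows. It computes a two-sided Mann-Whitney P-value using the normal approximation (an assumption of this sketch: no continuity correction and no tie correction to the variance, so it is only approximate for small or heavily tied samples) and then applies a Bonferroni correction. Function names and data are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_p(group_a, group_b):
    """Two-sided Mann-Whitney P-value via the normal approximation."""
    n1, n2 = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u1 - mean_u) / sd_u
    return math.erfc(abs(z) / math.sqrt(2.0))


def bonferroni(p_values):
    """Multiply each P-value by the number of tests, capping at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical intensities at two data points (m/z values), cases vs. controls.
p_separated = mann_whitney_p([1, 2, 3, 4, 5, 6, 7, 8],
                             [11, 12, 13, 14, 15, 16, 17, 18])
p_interleaved = mann_whitney_p([1, 3, 5, 7], [2, 4, 6, 8])
adjusted = bonferroni([p_separated, p_interleaved])
```

Data points would then be ranked by adjusted P-value, and those with low P-values and large group-mean differences reported.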

A suite of advanced signal processing and pattern classification methods may optionally be used to find patterns in the data that are indicative of the presence/absence of a biological state or condition of interest. Data analysis pipelines have been constructed from various methods of both signal processing and pattern recognition. Such pipelines may find relevant signals in complex data as well as very good discriminatory patterns. Sensitivities and specificities (as well as other relevant statistics such as area under the curve (AUC) of the receiver operating characteristic (ROC) curve and positive/negative predictive value) of patterns of data points that can highly discriminate between classes are reported. Examples of signal processing and pattern recognition methods used are described in more detail below.
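The reported statistics can be illustrated with the sketch below. Labels use 1 for case and 0 for control; the AUC is computed as the fraction of case/control pairs in which the case receives the higher score (ties counted as one half), which is equal to the area under the ROC curve. Function names and data are hypothetical.

```python
def sensitivity_specificity(labels, predictions):
    """Return (sensitivity, specificity) for binary labels and predictions."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of case and control scores."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical results for three cases and four controls.
labels = [1, 1, 1, 0, 0, 0, 0]
predictions = [1, 1, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(labels, predictions)   # 2/3 and 3/4
auc = roc_auc([1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1])           # perfect separation
```

Positive and negative predictive values can be obtained from the same four counts as tp/(tp+fp) and tn/(tn+fn), respectively.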

In the case that a pattern of markers for a biological state of interest (e.g., a condition such as disease) is discovered or known and we want to assay another sample to determine if that patient has the disease, data could be analyzed as follows. After separation time alignment with dynamic programming or non-linear fitting, the intensities of data points corresponding to the markers of interest could be normalized by dividing by the total ion current, the intensity of a calibrant, or by quantile normalization. The normalized intensities may then be log or square root transformed, or left as is. The resulting intensities would be combined as instructed in the discovery data analysis to yield a single number that would predict the biological or disease state of the patient. In this case, when assaying additional samples, no feature selection and pattern recognition would be used since the pattern would already be known.

EXAMPLES

The following prophetic example illustrates certain aspects of the invention.

Approximately one to five mL of blood will be collected through venipuncture into special tubes that contain the appropriate calibrants/controls. Following thorough clot formation, serum will be isolated from the sample following centrifugation. The serum sample will be aliquoted and frozen at −70° C. until analysis. On the order of 100 μL of thawed sample will be placed in a disposable plastic device that fits into a manifold, and hereafter, the entire process would be automated. The device will perform electrodialysis on the sample. Using an electric field and tangential flow, the sample will be passed through a membrane that allows only molecules under approximately 30 kD (not a sharp cutoff) to pass through into a second chamber. Molecules with the opposite charge or large molecules will not pass. A second membrane with a very low molecular weight cutoff (~500 D) will allow small molecules to pass out of the second chamber. Molecules that remain in the second chamber will therefore be in a MW range (500 D-30 kD). Most of these molecules will be peptides, protein fragments and small proteins. Salts will have been removed, as will most of the abundant polypeptides, such as albumin. This process should take approximately 60 minutes.

The molecules of interest (i.e., those that remain in the second chamber) will then be moved to another location on the disposable device, again using an electric field, and onto reverse phase beads for sample concentration. Using an organic solvent elution such as 50% methanol, the molecules will be eluted into a channel or well on a second disposable device, this time a microfluidics chip. On this chip, a 1-5 minute capillary electrophoretic separation, CZE or CEC, will be run to separate the molecules on the basis of electrophoretic mobility at the given pH (or hydrophobicity in the case of CEC). Preferred separation peak widths under 1 second will be utilized.

Separated molecules will be directed down a channel that leads to an electrospray ionization emitter that is built onto each chip. Expected flow rates are 50-200 nL/min. Prior to starting the separation, the microfluidics device will be aligned with the mass spectrometer using dynamic feedback circuitry to optimally control positioning stage placement and electrospray voltage to establish a stable spray and, assuming appropriate nL flow rates, allow the entire spray to be captured in the mass spectrometer orifice. Standards/calibrants would also be sprayed into the mass spectrometer using a dedicated second spray tip and used to set instrument parameters and for signal processing calibration purposes before the real samples are run.

An orthogonal multiplexed mass spectrometer captures the spray from the prepared/separated sample (given that it is separated, the molecules will be migrating in small groups) and yields spectra at a rate of 200 spectra/s. The mass spectrometer incorporates a dual ion funnel to support dynamic switching between calibrants and analyte sprays to optimize instrument accuracy. The instrument contains multiple quadrupoles in series, the second of which can, in real time during a data acquisition run, be used to dynamically filter ions by mass, thus allowing increased dynamic range or focus on particular mass ranges of interest. The orthogonal multiplexed implementation allows multiple ion packets to fly in the flight tube while at the same time decoupling mass accuracy from beam modulation rate, thus supporting high throughput, high sensitivity, and high mass resolution.

A resulting data set from one sample would have on the order of 10⁹ data points. Each data set would take approximately 5 minutes to collect, from start to finish. While a data set is being analyzed, a second sample could be run through the system to increase throughput.

Each data set would have its mass axis calibrated through a linear correction determined by the calibrants run before the sample and by the calibrants run in parallel in the dual ion funnel. Then dynamic programming would be used to align the separations axis (using the TIC) to some standard separations profile. Intensities would then be normalized by fitting the 90% intensity values to a standard spectrum.

These corrected data sets would then be fit using wavelets (or vaguelettes) that are specifically designed for separations/mass spectrometer data. The parameterized information about the spectrum would be soft thresholded and otherwise filtered to both remove noise and reduce dimensionality.
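The soft thresholding step mentioned above can be sketched as follows, using the standard definition in which each coefficient is shrunk toward zero by a threshold and coefficients smaller in magnitude than the threshold are set to zero. The coefficient values and threshold are hypothetical, and the wavelet (or vaguelette) fit that would produce the coefficients is not shown.

```python
def soft_threshold(coefficients, threshold):
    """Shrink each coefficient toward zero by `threshold`; zero out small ones."""
    shrunk = []
    for c in coefficients:
        if c > threshold:
            shrunk.append(c - threshold)
        elif c < -threshold:
            shrunk.append(c + threshold)
        else:
            shrunk.append(0.0)
    return shrunk


def count_nonzero(coefficients):
    """Number of parameters that survive thresholding (the reduced dimensionality)."""
    return sum(1 for c in coefficients if c != 0.0)


# Hypothetical fitted coefficients: two carry signal, two are noise-level.
fitted = [3.0, -0.5, -2.0, 0.2]
filtered = soft_threshold(fitted, 1.0)   # [2.0, 0.0, -1.0, 0.0]
kept = count_nonzero(filtered)           # 2
```

Zeroed coefficients can be dropped from the parameter set, so the same operation both removes noise and reduces the number of features passed to pattern recognition.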

During pattern discovery, a set of approximately 50 cases and 50 controls of these filtered parameter sets would be entered into a pattern recognition tool such as a linear support vector machine, but probably multiple learning algorithms will be used on each data set. The space of tunable parameters for the learning machine will be searched, and optimal patterns that distinguish the sample classes will be found, as would be error bounds on that prediction using cross-validation.

During validation or in clinical assay, the filtered parameters from each new data set would be classified into a category by identifying on which side of the decision boundary in the multidimensional parameter space that data set lies. Confidence intervals could also be calculated. This prediction and confidence interval would be reported back to the technician running the machine. In some embodiments the information about these clinical samples would be captured, and those results and the clinical outcomes of those patients would be used in pattern recognition with more samples, yielding better patterns to improve classification.
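For a linear pattern, the classification step described above reduces to evaluating on which side of a hyperplane the new sample falls. In the sketch below, the weights and bias stand in for a pattern learned during discovery (for example by a linear support vector machine); they, the feature values, and the function names are hypothetical. The distance to the boundary is shown only as a crude indication of confidence and is not a confidence interval.

```python
import math


def decision_score(features, weights, bias):
    """Signed score w.x + b; its sign gives the side of the decision boundary."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def classify(features, weights, bias):
    """Return (label, margin) for one sample's filtered parameters.

    margin is the Euclidean distance from the sample to the decision boundary.
    """
    score = decision_score(features, weights, bias)
    label = "case" if score > 0 else "control"
    margin = abs(score) / math.sqrt(sum(w * w for w in weights))
    return label, margin


# Hypothetical two-feature pattern found during discovery.
weights = [1.0, -1.0]
bias = -0.5
label_a, margin_a = classify([2.0, 1.0], weights, bias)   # score 0.5
label_b, margin_b = classify([0.0, 1.0], weights, bias)   # score -1.5
```

No feature selection or model fitting takes place at this stage, since the pattern is already known.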

Eventually, polypeptides/patterns that give rise to the most important data points for prediction could be identified using a tandem mass spectrometry approach. Once a pattern is discovered, separations will be optimized to increase the amount of information about the polypeptides of interest, by slowing down separations during the elution of those polypeptides and speeding them up elsewhere. This would allow for the use of a separate, efficient assay for every diagnostic developed.

It is to be understood that the above embodiments are illustrative and not restrictive. The scope of the invention should be determined with respect to the scope of the appended claims, along with their full scope of equivalents.

Example 1

Automated separation and detection of intact polypeptides from selected samples was performed using a sheathless CE-ESI-MS system. The selected CE-ESI-MS system was assembled from a combination of commercially available and custom-built instrumentation as follows.

Materials

The system included a Beckman P/ACE MDQ (Beckman Coulter, Fullerton, Calif.) with a cooled sample garage and an EDA cartridge to allow the separations capillary to exit the instrument to the mass spectrometer. The MDQ was grounded to the chassis of the mass spectrometer when CE-MS was performed.

The separations capillary was mated to the electrospray emitter via an ADPT-PRO nanoelectrospray adapter (New Objective, Woburn, Mass.). The adapter was used according to the instructions provided by the manufacturer. Briefly, the ends of the separation capillary and spray emitter are inserted into a modified, plastic, zero-dead-volume union and sealed in place with plastic finger-tight screws and sleeves. Voltage was applied via a metal adapter attached to the screw holding the emitter in place. The interface was mounted on an xyz positioning stage to allow adjustment of the emitter position relative to the inlet of the mass spectrometer. A CCD camera (Model KP-M22AN, Hitachi Kokusai, Japan) was mounted to enable visualization of the spray and the position of the emitter tip. For work with human serum, a plastic enclosure was built to enclose the interface in a chamber at a slight negative pressure.

Fused silica capillaries (360 μm OD, 50 μm ID) were purchased from Polymicro Technologies (Phoenix, Ariz.). The inner surface was cleaned and derivatized with methacryloylaminopropyltrimethylammonium chloride (MAPTAC) according to a variation of the procedure of Kelly, J. F. in Analytical Chemistry 1997, 69, 51-60. This produced a hydrophilic, positively-charged coating on the inner surface. Briefly, the capillary is rinsed with sodium hydroxide for 45 minutes, water for 45 minutes, and methanol for 15 minutes to clean the surface. Next, the capillary is silanized by flushing a 0.5% v/v solution of 7-oct-enyltrimethoxysilane in acidified methanol (0.5% v/v acetic acid in methanol) overnight followed by 15-minute rinses of methanol and water. To initiate polymerization, 40 μL of TEMED and 140 μL of 10% w/v freshly prepared APS are added to a freshly prepared solution of 5% MAPTAC. The MAPTAC solution is then pumped through the capillary overnight, followed by a one-hour water rinse. After derivatization with poly-MAPTAC, the capillaries were stored wet at 4° C. until use. Typically, two ~3 m lengths of capillary were prepared at the same time and were referred to as a batch. The electroosmotic flow (EOF) was measured under standardized conditions on a segment from each batch of poly-MAPTAC derivatized capillary and found to vary by less than 5% batch-to-batch.

Fused silica electrospray emitters (TT360-50-5-D-5) were purchased from New Objective (Woburn, Mass.) and derivatized with poly-MAPTAC according to the procedure described above. The emitters used for the pattern recognition experiment were purchased with a conductive coating applied to the distal end. The frontal (tip) end is tapered from the outer diameter of 360 μm to the inner diameter of 50 μm. After derivatization, emitters were stored submersed in water until use. Before use, emitters were rinsed with acetone and cut carefully to 3 cm. The cleaned and cut emitters were inspected under a microscope for the integrity of the polyimide and conductive coatings at the cut end of the emitter. Any overhanging coating material was carefully removed under microscope observation with a dental pick. Damaged emitters were not used and were discarded.

Methods

Selected samples were separated by capillary electrophoresis (CE), subjected to electrospray ionization (ESI) and analyzed in a mass spectrometer (MS) as follows. Electrophoresis was performed at a constant −20 to −40 kV voltage in a 65-cm capillary coated internally with poly-MAPTAC as described in the previous section. The run buffer was 10-30% methanol and 20-80 mM acetic acid (pH 3.2). The stacking solution was prepared by adding 5-10 μL of a stock of 5.02 N ammonia to 1.5 mL of run buffer (pH 4.7). For the pattern recognition experiment, serum was injected for about 5 seconds at about 9.5 psi followed by the stacking buffer for about 5 seconds at about 4.8 psi. Under these conditions, the EOF was approximately 5×10⁻⁴ cm²/V-sec.
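As a rough consistency check, the stated EOF mobility can be converted into a linear velocity and a volumetric flow rate. The sketch below assumes a separation voltage magnitude of 30 kV (the midpoint of the stated 20 to 40 kV range), the 65-cm capillary length, and the 50 μm inner diameter given for the capillaries in the Materials section; the function names are hypothetical. Under those assumptions the result is on the order of 270 nL/min, which is close to the approximately 280 nL/min flow rate reported for the spiked-serum experiments.

```python
import math


def eof_velocity_cm_per_s(mobility_cm2_per_v_s, voltage_v, length_cm):
    """Linear EOF velocity: mobility times field strength (V/cm)."""
    return mobility_cm2_per_v_s * (voltage_v / length_cm)


def volumetric_flow_nl_per_min(velocity_cm_per_s, inner_diameter_um):
    """Flow through a round capillary, assuming plug flow across the bore."""
    radius_cm = inner_diameter_um * 1e-4 / 2.0     # 1 um = 1e-4 cm
    area_cm2 = math.pi * radius_cm ** 2
    flow_cm3_per_s = velocity_cm_per_s * area_cm2
    return flow_cm3_per_s * 1e6 * 60.0             # 1 cm^3 = 1e6 nL; s -> min


# Assumed operating point: |V| = 30 kV across 65 cm, 50 um inner diameter.
velocity = eof_velocity_cm_per_s(5e-4, 30000.0, 65.0)   # about 0.23 cm/s
flow = volumetric_flow_nl_per_min(velocity, 50.0)       # about 272 nL/min
```

At about 0.23 cm/s, an unretained neutral marker would traverse the 65-cm capillary in under five minutes under these assumptions.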

To reduce evaporation, the bottom of a 2 mL Beckman P/ACE sample vial was filled with 250-450 μL of run buffer. The serum sample was transferred into a 200 μL PCR vial, suspended on a spring inside the 2 mL vial, and capped before loading into the sample tray of the P/ACE MDQ. The sample garage of the MDQ instrument was kept at 4° C.

Before each injection of serum, the capillary was rinsed and conditioned by a series of five pressure rinse steps performed for 1-3 minutes at 10-30 psi. The five solutions were, in sequence: 75 mM ammonia in run buffer, 1.8 M formic acid, water, 60 mM acetic acid, and run buffer.

The electrospray voltage was supplied independently by the mass spectrometer. While developing this methodology, the electrospray voltage was adjusted manually to provide optimal spray stability and detected signals, and was typically 2-3 kV. For selected experiments with spiked serum for pattern recognition, the volumetric flow rate was approximately 280 nL/min, and the electrospray voltage was constant at 2.3 kV. Furthermore, the mass spectrometer was operated in positive ion mode and was mass calibrated daily. The daily mass calibration may be particularly important for informatics algorithms to perform optimally, as the algorithms are sensitive to drifts in the mass accuracy.

In the development of the separations methodology, an ABI Mariner (Applied Biosystems, Foster City, Calif.) time-of-flight mass spectrometer was used as the detector. For the pattern recognition experiments involving serum, an in-house constructed orthogonal TOF mass spectrometer with a two-stage ion reflector was used. In this instrument, ions were introduced into the extraction chamber after passing through an electrodynamic ion funnel/collisional quadrupole assembly, selection quadrupole, and an Einzel lens arrangement. The home-built mass spectrometer was controlled and data acquired using a software program developed in a LabView environment (National Instruments, Austin, Tex.). The m/z resolution was typically 3500-4000 for the +3 charge state of neurotensin, and the mass accuracy was typically 3 ppm.

When performing CE-MS in automated mode, a relay-open step was incorporated into the electrophoresis method file to trigger mass spectral data acquisition. Instrument-specific parameters for the MDQ and TOF-MS were controlled independently.

Results

Because detection limits are an important factor in the discovery of biomarkers, the improved sensitivity provided by sheathless CE-ESI-MS makes it an effective biomarker discovery tool.

The initial selection of an ESI-MS combination in selected systems herein presented certain common and practical challenges. The use of ESI-MS as a detection method for CE imposes well-known restrictions on the choice of buffer and capillary chemistry. For example, to minimize blocking the inlet capillary of the MS with salt crystals and to minimize formation of salt adducts, only volatile components are used in the separation buffer. For maximum sensitivity, components that compete with the analytes for charge in the electrospray, causing signal loss due to ion suppression, should be excluded from the run buffer. Furthermore, the composition of the buffer must be chosen so as to support stable electrospray at the given flow rate of the separation. Optimal choices for buffer components are water, volatile organics (commonly acetonitrile or methanol), and volatile acids (commonly acetic or formic acid). When there is no sheath flow, the flow that supports the electrospray is supplied by the electro-osmotic flow (EOF) generated in the separations capillary. Since the MS was operated in positive-ion mode, the inner surface of the separation capillary was modified with the covalently-linked, hydrophilic, positively-charged coating poly(MAPTAC). Kelly, J. F., et al. have reported previously the utility of this coating chemistry for CE-MS of peptides in Analytical Chemistry 1997, 69, 51-60. The fixed positive charge on the coating generates the electro-osmotic flow, and it was expected that the combination of fixed positive charge and hydrophilicity of the coating would minimize adsorption of the primarily positively-charged components of serum.

As part of the sample preparation workflow, serum samples were de-salted by adsorption on reversed-phase material. After washing the reversed-phase material, the serum components were then eluted in 60-80% acetonitrile/0.1-0.5% acetic acid. Thereafter, performance of the separations in an aqueous solution of acetic acid or formic acid and acetonitrile (0-40%) was first investigated.

Example 2

FIG. 10 illustrates how improved separations can result in improved signal output. In particular, FIG. 10 shows the separation data of a mixture of seven polypeptides in acetonitrilic (bottom trace) and methanolic (top trace) solutions. In each case, the concentration of acetic acid was 50-70 mM. Electrophoresis was performed at 500 V/cm in a 60 cm, 50 μm ID poly-MAPTAC treated capillary. Detection was by UV absorbance at 214 nm, 50 cm from the injection end. The composition was as follows: (NM) 0.001X eCAP™ Neutral Marker, (1) neurotensin, (2) angiotensin I, (3) bradykinin, (4) carbonic anhydrase, (5) ribonuclease A, (6) myoglobin, and (7) cytochrome c.

In FIG. 10, the seven polypeptides are separated approximately equally well in both acetonitrile- and methanol-containing solutions; however, the later-migrating proteins are better resolved in the methanolic solution. A range of concentrations of methanol (0-40%) and acetic acid (20-80 mM) were investigated for their ability to separate a standard set of peptides and proteins and for the stability of electrospray. It was found that using 20% methanol and 60 mM acetic acid gave the best combination of resolution, run-time, and electrospray performance.

To minimize concerns of sample-to-sample carry-over from adsorption of serum components and to improve the reproducibility of migration times from run-to-run, a capillary rinsing and conditioning procedure was developed and implemented. This procedure consists of rinsing the capillary with alkaline and acidic solutions and then conditioning the surface by flushing with water, dilute acid (60 mM acetic) and, finally, the separation buffer.

For the rinsing solutions, sodium hydroxide and hydrochloric acid were used first, as other authors have done for separations of serum components. Altria, K., Capillary Electrophoresis Guidebook: Principles, Operation, and Applications, Humana Press, Totowa, N.J. 1996; Paroni, R., et al., Electrophoresis 2004, 25, 463-468. However, it was found that even with the subsequent flushing steps, enough sodium and chloride ions were retained in the system to create detectable sodium and chloride adducts of serum components. To eliminate these undesired adducts, sodium hydroxide and hydrochloric acid were replaced with ammonium hydroxide (75 mM, pH 9.2) and formic acid (1.8 M, pH 1.6).

There are many choices for how to concentrate samples in-line in CE; for example, field-induced sample stacking (Altria, K., Capillary Electrophoresis Guidebook: Principles, Operation, and Applications, Humana Press, Totowa, N.J. 1996; Weinberger, R., Practical Capillary Electrophoresis, Academic Press, Inc., San Diego, Calif. 1993), transient isotachophoresis (Foret, F., et al., Electrophoresis 1993, 14, 417-428; Larsson, M., et al., Electrophoresis 2000, 21, 2859-2865; Smith, R. D., et al., Anal Chem 1990, 62, 882-899; Auriola, S., et al., Electrophoresis 1998, 19), in-line reverse-phase chromatography columns (Tempels, F. W. A., et al., Anal Chem 2004, 76; Stroink, T., et al., Electrophoresis 2003, 24, 897-903; Figeys, D., et al., Nature Biotechnology 1996, 14, 1579-1583), membrane preconcentration (Tomlinson, A. J., et al., J Capillary Electrophor 1995, 2, 225-233; Tomlinson, A. J., et al., J Am Soc Mass Spectrom 1997, 8, 15-24), etc.

The experiments performed herein provide the basis for selecting a transient isotachophoresis concentration method to improve sensitivity. The transient isotachophoresis (tITP) step was also selected for its simplicity to concentrate relatively large injection volumes of serum. As a sample, the processed serum is complex and reasonably concentrated, containing many separable components detectable by UV absorbance (214 nm). This is relevant because an in-line concentration step is applied to maximize the number of dilute species that are detectable in a background of more concentrated species.

Example 3

FIG. 4 demonstrates the tradeoff of signal gain and resolution for zone electrophoresis (ZE) versus tITP-ZE separations. Approximately 13-fold more sample was loaded for the tITP-ZE separation, resulting in an improvement of ten- to fourteen-fold in signal. Electrophoresis was performed in 10-30% methanol/50-70 mM acetic acid at 500 V/cm in a 60 cm, 50 μm ID poly-MAPTAC treated capillary. Detection was accomplished by UV absorption at 214 nm at 50 cm from the injection end. For the ZE run, sample was injected for 6 seconds at 1 psi. For the tITP-ZE run, sample was injected for 8 seconds at 9.5 psi, followed by an 8 second, 9.5 psi injection of the stacking solution. The components, each at a concentration of 10 μg/mL, are as follows: (1) neurotensin, (2) angiotensin I, (3) bradykinin, (4) carbonic anhydrase, (5) myoglobin, (6) cytochrome c. For these analytes, the signal intensity increases approximately ten-fold upon injecting 13 times more sample and a plug of ammonia-containing separation buffer. However, it was noted that although the injected volume is stacked into a zone that gives rise to peaks that are fairly symmetrical, some resolution is lost.
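As a rough check on the loading figure quoted for FIG. 4, the volume delivered by a pressure injection can be taken as proportional to the product of pressure and time when the capillary and sample viscosity are unchanged. That proportionality is an assumption based on Poiseuille flow and is not stated in this description. The following Python sketch, with an illustrative function name, compares the two sample injections.

```python
def relative_load(pressure_psi, seconds):
    """Pressure-time product, taken as proportional to injected volume
    for a fixed capillary and sample viscosity (assumed)."""
    return pressure_psi * seconds

ze_load = relative_load(1.0, 6.0)      # ZE run: 6 s at 1 psi
titp_load = relative_load(9.5, 8.0)    # tITP-ZE run: 8 s at 9.5 psi

# Ratio of sample loaded in the tITP-ZE run relative to the ZE run
fold_more_sample = titp_load / ze_load
print(round(fold_more_sample, 1))
```

Under this assumption the ratio is about 12.7, consistent with the "approximately 13-fold" figure.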

A noted concern for this embodiment was whether, for MS detection, the gain in total number of detectable and quantifiable species achieved by injecting more sample was offset by ion suppression resulting from the loss of electrophoretic resolution between species. An absolute answer to this question may be ascertained with a devised algorithm that counts the total number of species detected in a CE-MS run. In the absence of this algorithm during the development of this procedure, a series of CE-MS experiments was performed in which the amount of sample injected was varied and the separation was performed either by ZE alone or by tITP-ZE. It was found that a modest (as much as five-fold) increase in signal, which varied from component to component, could be obtained by injecting a relatively large amount of sample and performing tITP-ZE. Accordingly, another preferable embodiment of the invention provides a system that combines transient isotachophoresis (tITP), capillary zone electrophoresis (ZE), electrospray ionization (ESI) and mass spectrometry (MS).

The ammonia concentration (20-80 mM) and the ratio of sample-to-stacking plugs were also investigated to determine conditions for a reasonable resolution and signal gain. It was found that for a 60-cm capillary, the best signal gain with MS detection was obtained when the sample was injected for about 5 seconds at about 9.5 psi and the stacking solution (25 mM ammonium in 20% methanol/60 mM acetic acid, pH 4.7) was injected for about 5 seconds at about 4.8 psi.

FIG. 5(a) shows a comparison of the base peak intensity (BPI) trace for pooled human serum separated by ZE (lower trace) and that separated by tITP-ZE (upper trace). The signal displayed is relative to a value of 100 for the maximum intensity in the data set. For the data in FIG. 5, the amount of injected serum and run conditions (applied voltage, capillary, buffer, etc.) were the same, except that in the tITP-ZE separation, the injection of serum was followed by an injection of the ammonium stacking solution as described in the CE-ESI-MS system conditions noted above. By comparing the two BPI traces, narrower peaks are observed for the tITP-ZE separation.

FIG. 5(b) shows a comparison of the spectra where angiotensin I (m/z 432.9) has its maximum intensity for the two separations shown in FIG. 5(a). The spectrum for the ZE separation lies within that for the tITP-ZE separation. Angiotensin I was added to human serum before processing the serum. By extracting ion electropherograms for individual components, we find that individual components typically have a narrower peak width and a higher signal in the tITP-ZE data. For example, the maximum intensity for angiotensin I (m/z 432.9, +3 charge state) is approximately four times greater with tITP (˜2950) than without (˜720) (FIG. 5(b)).
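The m/z value quoted for angiotensin I can be related to its mass by the usual relation m/z = (M + z·m_p)/z for a species carrying z protons. The Python sketch below assumes a monoisotopic mass of 1295.68 Da for angiotensin I and a proton mass of 1.00728 Da; both values come from general reference data rather than from this description. It also computes the signal gain from the approximate intensities quoted for FIG. 5(b).

```python
PROTON_MASS = 1.00728          # Da (assumed reference value)
ANGIOTENSIN_I_MASS = 1295.68   # Da, monoisotopic (assumed reference value)

def mz(neutral_mass, charge):
    """m/z of a positive ion formed by adding `charge` protons."""
    return (neutral_mass + charge * PROTON_MASS) / charge

mz_3plus = mz(ANGIOTENSIN_I_MASS, 3)

# Approximate maximum intensities quoted in the text for FIG. 5(b)
signal_gain = 2950 / 720

print(round(mz_3plus, 1), round(signal_gain, 1))
```

With these assumed masses the +3 charge state falls at about m/z 432.9, and the quoted intensities correspond to a gain of about 4.1.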

It is believed that the mechanism of stacking is likely due to a combination of several effects. For example, the ammonium ion has a faster mobility than the serum components, and therefore the serum components should stack against the boundary with the ammonium ions for as long as ITP conditions persist local to the sample zone. Additionally, the pH of the ammonium solution is higher than that of the sample, and therefore peptides that migrate through the boundary into the ammonium zone may become less positively charged and slow, also causing the stacking to occur at the boundary with the ammonium zone.

The following three techniques were tested to apply the voltage to the fluid in the emitter: (1) the use of a distally coated emitter from New Objective, (2) the use of a stainless steel union to join the emitter and capillary, and (3) the use of a t-junction in which a platinum or palladium wire was inserted perpendicular to the capillary-emitter axis. The metal union was easy to assemble and use; however, several undesired contaminant peaks were observed when performing CE-MS, and this was hypothesized to arise from iron-acid interactions. Furthermore, the t-junction was found to be less robust than the distally coated emitters from New Objective. Emitters where the tip was drawn to a smaller inner diameter at the end (SilicaTips) and emitters where only the external (outer) diameter is tapered (TaperTips) were utilized. Tips with inner diameters of 8-30 μm were prone to clogging. It was found that an externally tapered tip with 50 μm ID (equivalent to the ID of the separations capillary) worked best. The internal surface of the emitter was also cleaned and coated with poly(MAPTAC) to match the surface coating in the separations capillary. To extend the lifetime of the emitter to between one and five days of constant use, a careful procedure was developed to cut, trim and clean the emitter. Rinsing of the emitter with acetone to remove adherent material from the packaging and examining the emitter end for a clean, perpendicular cut with no damage to the coating were found to be critical. For the best signal observed, the emitter was positioned on-axis with the inlet capillary of the assembled mass spectrometer, and the tip was placed approximately 1-5 mm from the MS inlet.

In the exemplary embodiments of the invention described herein, samples were run through a selected CE system before reaching the interface between the capillary and the electrospray emitter. For sheathless electrospray interfaces as described elsewhere, the separations capillary can be coupled directly to the electrospray emitter by means of a junction or by fabricating the spray tip from the end of the separations capillary. The spray voltage can be supplied either at the junction or at the tip of the emitter. It was observed that when the spray voltage is applied to the tip end of a frontally coated electrospray emitter (SilicaTips, New Objective), frequent electrical arcing from the emitter to the metal curtain gas plate on the ABI Mariner occurred. The arcing destroyed the conductive coating and rendered the emitter useless. Therefore, the frontally-coated emitters were abandoned in favor of applying the voltage at the junction between the separation capillary and the emitter.

Example 4

Experiments were performed to assess to what extent serum samples could be distinguished and classified based on patterns of component intensities. A total of 76 CE-MS analyses were planned on 18 individual human serum samples and 8 pooled serum samples. Each sample was analyzed two to five times, in random order. Pooled serum samples were made by combining an aliquot of each individual sample to eliminate effects caused by biological variability between individuals. One of two specific sets of 13 polypeptide standards in pre-determined amounts was added to each sample, creating two sample groups: A and B. The final concentration of each polypeptide in each sample group is given in Table 3.

TABLE 3

Type                          Component        Group A (nM)  Group B (nM)  Fold
Pre-processing standard       Insulin β-chain  500           500           1
                              Ubiquitin        200           200           1
Post-processing standard      Lysozyme         100           100           1
                              Neurotensin      100           100           1
Pattern recognition standard  Angiotensin I    10            100           10
                              Angiotensin III  100           800           8
                              Aprotinin        50            150           3
                              Bradykinin       100           200           2
                              Insulin          500           25            20
                              LHRH fragment    150           750           5
                              Melittin         1000          100           10
                              Renin substrate  25            250           10
                              Substance P      1000          250           4
Total Spiked Concentration:                    2935          2625

Two components, neurotensin and lysozyme, were added after sample processing and before CE-MS analysis as standards that could be used to characterize the performance of the CE-ESI-MS methodology. These components, the post-processing standards, were added to a final concentration of 100 nM in each sample. All other peptides and proteins were added before any processing was performed on the serum sample. Two of these, ubiquitin and insulin β-chain, were added to each sample at 200 nM and 500 nM, respectively, in the starting serum volume. The other nine peptides and proteins were added at different levels in Group A samples than in Group B samples to emulate a different pattern of peptide concentrations between the two groups. The difference in concentration of each of the nine ‘pattern recognition standards’ between the two groups varied from two to twenty-fold. The concentrations in Group A and Group B were chosen so that similar total molar amounts of peptides were added to each group of samples.
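The totals and fold differences in Table 3 can be checked directly from the listed concentrations. The Python sketch below uses the Group A and Group B values for the nine pattern recognition standards as given in Table 3; the dictionary and function names are illustrative only.

```python
# (Group A nM, Group B nM) for the nine pattern recognition standards in Table 3
PATTERN_STANDARDS = {
    "Angiotensin I": (10, 100),
    "Angiotensin III": (100, 800),
    "Aprotinin": (50, 150),
    "Bradykinin": (100, 200),
    "Insulin": (500, 25),
    "LHRH fragment": (150, 750),
    "Melittin": (1000, 100),
    "Renin substrate": (25, 250),
    "Substance P": (1000, 250),
}

def fold(a, b):
    """Fold difference, expressed as the larger value over the smaller."""
    return max(a, b) / min(a, b)

total_a = sum(a for a, _ in PATTERN_STANDARDS.values())
total_b = sum(b for _, b in PATTERN_STANDARDS.values())
folds = {name: fold(a, b) for name, (a, b) in PATTERN_STANDARDS.items()}

print(total_a, total_b, min(folds.values()), max(folds.values()))
```

The sums reproduce the total spiked concentrations of 2935 nM and 2625 nM, and the fold differences span 2 to 20.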

The CE-MS runs were performed in an automated mode with analytical systems provided in accordance with other aspects of the invention. Ten samples were loaded into an autosampler at a time. All of the post-processing standards and pattern recognition standards were added to the samples before the start of the experiment. The samples were stored at −20° C. until they were run and in between repeat analyses. At the start of every day during experimentation, the system was conditioned with three runs of a standardized serum sample, and then a standard set of ten peptides was run to monitor the separation performance and signal intensity. If fluid wicked back along the emitter tip, or if the signal could not be brought to within 10% of the typical signal for the set of ten peptides, the emitter was discarded and replaced with a new one.

FIG. 6 represents the CE-MS data for human serum in a 2-D format, similar to that of a 2-D PAGE gel. Black regions of the illustration generally correspond to relatively high intensity. Each vertical segment represents a single charge state of a component. Proteins can be recognized by their charge envelopes, which appear as a set of lines spaced in the m/z axis. Data was collected for an individual serum sample during the pattern recognition experiment. The illustration provided depicts one of the runs of individual sera displayed in a “pseudo-2D-gel” format, with m/z increasing from right to left, and separation time increasing from top to bottom, with relatively black regions indicating high intensity and relatively white regions indicating zero intensity. However, unlike in a typical image of a 2-D protein gel, each serum component in this separation may give one or more spots or lines, according to the number of charge states detected. When employing more enhanced graphics to view results with even greater resolution, resulting images other than those shown herein as examples could further display the isotopic resolution of the components.

In general, only one or two charge states are detected for smaller peptides such as neurotensin, whereas multiple charge states are observed for proteins, such as residual human serum albumin.

In FIG. 7, the migration time of neurotensin, one of the post-processing standards, is plotted as a function of run order. The solid horizontal line denotes the mean value, and the dotted lines denote the bounds of one standard deviation. The average migration time is 436.5 ± 9 seconds. Most of the data lies within one standard deviation of the mean. Furthermore, the migration times are distributed more or less randomly with run order, indicating that the tITP-ZE methodology is performing equivalently throughout the experiment.
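The summary statistics described for FIG. 7 can be sketched as follows. The migration times in this Python example are hypothetical values chosen only for illustration; they are not the measured data, and the function name is likewise illustrative.

```python
from statistics import mean, stdev

def summarize(times):
    """Return the mean, sample standard deviation, and the fraction of
    values lying within one standard deviation of the mean."""
    m = mean(times)
    s = stdev(times)
    within = sum(1 for t in times if abs(t - m) <= s)
    return m, s, within / len(times)

# Hypothetical neurotensin migration times (seconds), listed in run order
migration_times = [430.0, 440.0, 436.0, 452.0, 433.0, 438.0, 421.0, 442.0]

avg, sd, fraction_within = summarize(migration_times)
print(avg, round(sd, 1), fraction_within)
```

For these illustrative values the mean is 436.5 seconds, the standard deviation is about 9.1 seconds, and three quarters of the runs fall within one standard deviation.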

It was investigated whether there was a correlation of the data with the day a sample was run. For the pre- and post-processing standards, which are present in the same concentration in each sample, we calculated a total intensity, akin to the area of a single-component peak in an electropherogram. Where more than one charge state was detected for a component, the intensities of the two most prevalent charge states were summed. Then the total intensity was plotted against run order, and no obvious grouping of the intensities by day was found.

As described above, the pattern recognition standards were added to the serum samples such that the difference in their concentration between the two groups spanned from 2- to 20-fold.

Example 5

FIG. 8 provides example data for Substance P, which was added into samples in Group A at a 4-fold higher concentration than into samples in Group B. The graph provided shows the mathematically averaged mass spectra for Group A (solid line) and for Group B (dotted line). Black circles on the x-axis identify the values of m/z determined to be distinguishing features by our support vector machine (SVM)-based feature selection algorithms. These features are adjacent to each other (the black circles appear as a line) and correspond to the m/z for the first three isotope peaks of Substance P in its doubly charged state. The difference in average signal is easily discernible by eye. Immediately to the right of the isotope envelope for Substance P is an unidentified serum component (m/z 676.4), whose intensity was not significantly different between the two sample groups and was therefore identified correctly as a non-distinguishing feature.

To determine the fold-difference in concentration that was detected among the samples, the mean total intensity for each standard over all runs of Group A samples and the mean total intensity for each standard over all runs of Group B samples were used. Then, for each standard, the total intensities of that standard in Group A were compared to those in Group B by performing a Student's t-test. The result of the t-test is a p-value, which is the probability of observing a difference in means at least as large as the one measured if there were no true difference between Groups A and B. For example, a p-value of 0.5 means that a difference this large would arise by chance alone half of the time, and, hence, one would conclude that there is no statistically significant difference between the means. Conversely, a p-value of 0.0001 indicates a statistically significant difference between the means, because a difference this large would arise by chance alone only 0.01% of the time.
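The comparison of group means can be illustrated with a short Python sketch. Note that it substitutes an exact permutation test for the Student's t-test named in the text, because the permutation test needs only the standard library and makes the meaning of a p-value explicit: it counts how often a random relabeling of the runs gives a difference in means at least as large as the one observed. The intensities and function names are hypothetical.

```python
from itertools import combinations
from statistics import mean

def fold_difference(a, b):
    """Ratio of the larger group mean to the smaller group mean."""
    hi, lo = max(mean(a), mean(b)), min(mean(a), mean(b))
    return hi / lo

def permutation_p_value(a, b):
    """Two-sided exact permutation test on the difference in means
    (a stand-in for the t-test described in the text)."""
    pooled = a + b
    observed = abs(mean(a) - mean(b))
    hits = total = 0
    for idx in combinations(range(len(pooled)), len(a)):
        chosen = set(idx)
        grp_a = [pooled[i] for i in idx]
        grp_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point round-off
        if abs(mean(grp_a) - mean(grp_b)) >= observed - 1e-12:
            hits += 1
    return hits / total

# Hypothetical total intensities for one standard in each group
group_a = [1.0, 2.0, 3.0, 4.0]
group_b = [11.0, 12.0, 13.0, 14.0]

observed_fold = fold_difference(group_a, group_b)
p_value = permutation_p_value(group_a, group_b)
print(observed_fold, round(p_value, 4))
```

With four runs per group there are only 70 possible relabelings, so the smallest attainable two-sided p-value is 2/70; larger numbers of runs, as in the experiment described, allow much smaller p-values.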

Table 4 below shows the p-values for all standards analyzed, the observed (detected) fold difference, and the expected fold difference in concentration for all of the polypeptides added to the sera. The observed fold differences for the pre- and post-processing standards range from 1.05 to 1.30, close to the expected value of 1.0, as these standards are present at the same concentration in Group A and Group B. In particular, there was only a 5% difference between the mean total intensities for neurotensin, and the p-value for this difference was greater than 0.5. The two post-processing standards, neurotensin and lysozyme, have p-values an order of magnitude higher than those of the pre-processing standards, ubiquitin and insulin β-chain. Therefore, it is likely that ubiquitin and insulin β-chain are more sensitive to an unidentified effect correlated to the two groups of samples (e.g. the additional peptides spiked into each group). The significance of these results may be further considered with additional data.

TABLE 4

Standard             Component        t-test p-value  Observed Fold  Expected Fold
pre-processing       Insulin β-chain  0.04712         1.3            1
                     Ubiquitin        0.01436         1.3            1
post-processing      Lysozyme         0.33615         1.2            1
                     Neurotensin      0.71149         1.0            1
pattern recognition  Angiotensin I    0.00001         7.6            10
                     Angiotensin III  0.00000         6.3            8
                     Aprotinin        0.00003         1.9            3
                     Bradykinin       0.00000         1.6            2
                     Insulin          0.00000         13.4           20
                     LHRH fragment    0.00000         4.5            5
                     Melittin         0.08071         3.8            10
                     Renin substrate  0.00000         7.8            10
                     Substance P      0.00000         3.4            4

As shown in Table 4, the p-values are less than 0.0001 for all pattern recognition standards except melittin. Therefore, with the exception of melittin, the differences in mean total intensities between the groups are statistically significant. There was a 1.6-fold difference in the mean total intensities for Group A and B for bradykinin, which was spiked in at twice the concentration in Group B as in Group A. Therefore, the system provided in accordance with this embodiment of the invention is capable of detecting at least a two-fold difference in the average concentration of a component in two groups.

Example 6

The results in the preceding sections suggest that if a particular component (a biomarker, for example) has at least a two-fold different concentration on average between the two groups, the difference can be detected and quantified with reasonable accuracy and certainty. A desired goal of the experimentation conducted was to determine whether it was possible, without a priori knowledge of the markers, to automatically identify the pattern recognition standards as those and only those features which differentiate Groups A and B, and furthermore, whether classification of samples as belonging to Group A and Group B was possible using the pattern recognition algorithm.

The pattern recognition algorithm selected was based on the use of support vector machines (SVM) on signal-processed data. (Boser, B. E., et al., In Computational Learning Theory, 1992, pp 144-152; Cristianini, N., et al., An introduction to support vector machines, Cambridge University Press, 2000; Vapnik, V., Statistical Learning Theory, John Wiley and Sons, 1998.)

The result of signal processing was a single intensity vs. m/z spectrum for each CE-MS run. The raw data was processed by first removing noise from the m/z spectra via wavelet transformation (Donoho, D. L., Applied and Computational Harmonic Analysis 1995, 2, 101-126). Then, the intensities for each m/z over all spectra collected during the run were summed, effectively ‘collapsing’ the data over separation time.
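The ‘collapsing’ step can be sketched in Python as follows. The example assumes that every spectrum in a run has already been de-noised and shares the same m/z axis, so that position j in each spectrum refers to the same m/z value; the wavelet de-noising itself is not shown. The function name and the small data set are illustrative.

```python
def collapse_over_time(spectra):
    """Sum the intensity at each m/z position across all spectra of one
    run, giving a single intensity vs. m/z spectrum."""
    collapsed = [0.0] * len(spectra[0])
    for spectrum in spectra:
        for j, intensity in enumerate(spectrum):
            collapsed[j] += intensity
    return collapsed

# Hypothetical run: three spectra (successive time points), three m/z positions
run = [
    [0.0, 2.0, 0.0],
    [1.0, 5.0, 0.0],
    [0.0, 3.0, 4.0],
]

collapsed = collapse_over_time(run)
print(collapsed)
```

A component that migrates over several consecutive spectra therefore contributes its whole peak to a single value per m/z.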

After signal processing, support vector machines were used in an iterative manner to identify and select those features (i.e. m/z values) that differentiate Group A from Group B. The signal-processed data was divided into two sets: a “training set” and a “test set.” Within the training set, the data was sub-divided by group, since it is known which samples belong to Group A and which belong to Group B. The SVM algorithm was then run on the training set. The result is a weights vector which indicates the relative importance (weight) of each m/z in differentiating Group A from Group B. Next, the training set of data was ‘updated’ by taking the dot product of the weights vector and the raw data. SVM is run on the updated data, forming a new weights vector. The process of running SVM to form a new weights vector and updating the data was repeated so that the only features (m/z values) retained are those which best distinguish the groups. These features were the selected features that make up the distinguishing pattern.
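The iterative scheme can be sketched in Python under several stated assumptions. The trainer below is a minimal linear SVM fitted by subgradient descent on the hinge loss, without a bias term; it is a stand-in and not the implementation used in the experiments. The ‘update’ step is interpreted here as scaling each feature by the normalized magnitude of its weight, which is one reading of the description above. The number of rounds, the retention threshold, and the mean-centered data are all hypothetical.

```python
def train_linear_svm(X, y, lam=0.01, epochs=300):
    """Subgradient descent on the L2-regularized hinge loss (no bias term).
    Labels y are +1 (Group A) or -1 (Group B)."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for t in range(1, epochs + 1):
        eta = 1.0 / (lam * t)                 # decreasing step size
        grad = [lam * wj for wj in w]
        for xi, yi in zip(X, y):
            # Samples inside the margin contribute to the subgradient
            if yi * sum(wj * xj for wj, xj in zip(w, xi)) < 1.0:
                for j in range(d):
                    grad[j] -= yi * xi[j] / n
        w = [wj - eta * gj for wj, gj in zip(w, grad)]
    return w

def select_features(X, y, rounds=3, keep_above=0.05):
    """Repeatedly train, then rescale each feature by its normalized |weight|;
    features whose cumulative scale stays above the threshold are retained."""
    scale = [1.0] * len(X[0])
    for _ in range(rounds):
        scaled = [[xj * sj for xj, sj in zip(row, scale)] for row in X]
        w = train_linear_svm(scaled, y)
        top = max(abs(wj) for wj in w) or 1.0
        scale = [sj * abs(wj) / top for sj, wj in zip(scale, w)]
    return [j for j, sj in enumerate(scale) if sj > keep_above]

# Hypothetical mean-centered intensities: feature 0 differs between groups,
# feature 1 is uncorrelated noise, feature 2 is identical in both groups.
X = [[1.0, 0.1, 0.0], [1.0, -0.1, 0.0], [-1.0, 0.1, 0.0], [-1.0, -0.1, 0.0]]
y = [1, 1, -1, -1]

selected = select_features(X, y)
print(selected)
```

In this toy case only feature 0 is retained. Real spectra contain thousands of m/z values, for which an established SVM library would normally be preferred over a hand-written trainer.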

The final step in this process was to classify a sample as belonging to either Group A or Group B. To do this, all the original, raw data is reduced so that for each CE-MS run, the only intensities that remain in the data set are those that correspond to the selected features. The SVM is run one last time with the data reduced in this manner to give the weights vector which may be used to classify samples (the classification rule). All the samples in the test set are classified by forming the dot product of the classification rule with the reduced data for each sample and examining the sign of the product. If the sign is positive, the sample belongs to Group A, and if negative, the sample belongs to Group B.
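The classification rule described above amounts to taking the sign of a dot product. The following Python sketch uses a hypothetical two-feature rule and hypothetical reduced intensities; assigning a score of exactly zero to Group B is a convention adopted here, since the text does not address that case.

```python
def classify(rule, reduced_intensities):
    """Return "A" if the dot product of the classification rule with the
    reduced data is positive, otherwise "B"."""
    score = sum(w * x for w, x in zip(rule, reduced_intensities))
    return "A" if score > 0 else "B"

# Hypothetical classification rule over two selected features
rule = [0.8, -0.5]

label_1 = classify(rule, [3.0, 1.0])   # score = 2.4 - 0.5 = 1.9
label_2 = classify(rule, [1.0, 4.0])   # score = 0.8 - 2.0 = -1.2
print(label_1, label_2)
```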

To estimate how well data could be classified, a three-fold cross validation study was performed. Cross-validation based on multiple folds (groupings) is a statistical technique that has been shown to be a reliable empirical method to estimate the error of an algorithm. Efron, B., J. Amer. Statist. Assoc. 1983, 78, 316-331; Stone, M., J. Roy. Statist. Soc. 1974, 36, 111-147.

The data was randomly separated into three sets: 1, 2, and 3. Sets 1 and 2 were combined to form the training set (as discussed above). The remaining set, set 3, was the ‘test set,’ the set of data that would be classified. In this way, the data used to develop the algorithm is independent from that used to test the algorithm, and therefore the statistics on the accuracy of the algorithm are more indicative of how the algorithm performs on a much larger, more general data set. Stone, M., J. Roy. Statist. Soc. 1974, 36, 111-147. The process of feature selection and sample classification was repeated twice more so that each of the three sets of samples was used as the test set, completing the three-fold cross validation.
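The three-fold partitioning can be sketched in Python as shown below. The sample identifiers, the random seed, and the function name are illustrative; feature selection and classification would be run once per train/test pair.

```python
import random

def three_fold_splits(sample_ids, seed=0):
    """Randomly partition the samples into three sets and yield
    (training set, test set) pairs, each set serving once as the test set."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    folds = [ids[i::3] for i in range(3)]
    for k in range(3):
        test = folds[k]
        train = [s for j, fold in enumerate(folds) if j != k for s in fold]
        yield train, test

# Nine hypothetical sample identifiers
splits = list(three_fold_splits(range(9)))
for train, test in splits:
    print(sorted(train), sorted(test))
```

Each sample appears in exactly one test set and is never in the training set used to classify it.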

Table 5 below provides the results of the feature selection for the components added to serum for each of the three sets of data.

TABLE 5

Type                          Component        Set 1  Set 2  Set 3
Pre-processing standard       Insulin β-chain  −      −      −
                              Ubiquitin        −      −      −
Post-processing standard      Lysozyme         −      +      −
                              Neurotensin      −      −      −
Pattern recognition standard  Angiotensin I    +      +      +
                              Angiotensin III  +      +      +
                              Aprotinin        −      +      +
                              Bradykinin       +      +      +
                              Insulin          +      +      +
                              LHRH fragment    +      +      +
                              Melittin         +      +      +
                              Renin substrate  +      +      +
                              Substance P      +      +      +

A plus sign appears in the table where a component was identified as a distinguishing feature, and a minus sign appears where a component was not identified as a distinguishing feature. Minus signs are therefore expected for all the table entries for pre- and post-processing standards, as those components were added to Group A and Group B samples in equivalent amounts. Plus signs are likewise expected in the rows for the pattern recognition standards, as the concentrations of these components differed between the groups. Out of the three sets of data and the nine pattern recognition standards, in only one instance (aprotinin in set 1) was a pattern recognition standard not identified as a distinguishing feature. Likewise, in only one instance (lysozyme in set 2) was a post-processing standard identified as a distinguishing feature.

Using the classification rule based on identified features, the samples in each of the three test sets were assigned to either Group A or Group B. The accuracy obtained was determined to be approximately 94%.

Example 7

Samples

Individual human serum samples were obtained from Golden West Biologics(Temecula, Calif.).

Samples were prepared by adding thirteen polypeptides as mock biomarkers at pre-determined levels to two groups of human sera. Because the targets of the biomarker discovery experiments herein were peptides and small proteins, a procedure was developed to deplete the serum of proteins larger than 50,000 MW. This step effectively removed the majority of the high abundance proteins, such as serum albumin and immunoglobulin G, which could have overwhelmed the lower abundance peptides of interest. Eight proteins alone constitute approximately 90% of the 60-80 milligrams of protein per milliliter of serum (Burtis, C. A., et al., Tietz Textbook of Clinical Chemistry, W. B. Saunders Company, Philadelphia, Pa. 1999; Putnam, R. W., The plasma proteins, Academic Press, New York 1975); and therefore the high-abundance proteins are of less interest. This procedure also effectively de-salts the sample to reduce the conductivity of the sample and to avoid the possible formation of salt adducts in the electrospray.

The procedure consisted of diluting 50 μL of human serum ten-fold and filtering the diluted serum through an Amicon YM50 (Millipore Corporation, Billerica, Mass.) molecular weight cut-off membrane at about 14,000 g for 10 to 40 minutes at room temperature. After centrifugation, 15 to 35 μL of 5-12% trifluoroacetic acid was added to the filtrate, and the filtrate was loaded onto a pre-equilibrated, C8 reverse-phase Optiguard guard column (Optimize Technologies, Oregon City, Oreg.) at 70-90 μL/min. The column was washed with 150-250 μL of 3-7% acetonitrile/0.1-0.5% acetic acid to remove salt, and the serum components were eluted with 15-25 μL of 60-80% acetonitrile/0.1-0.5% acetic acid. The column may be re-used after rinsing with 90-99% acetonitrile and equilibrating with 3-7% acetonitrile/0.1-0.5% acetic acid.

Materials

Materials and reagents were obtained from the following sources. Glacial acetic acid (99+%), formic acid (96%), 5.02 N ammonium hydroxide volumetric standard, ammonium persulfate (APS), 7-oct-1-enyltrimethoxysilane, 3-methacryloylaminopropyl trimethylammonium chloride (MAPTAC), N,N,N′,N′-tetramethylethylenediamine (TEMED), human angiotensin I, angiotensin III, bovine lung aprotinin, bradykinin, bovine heart cytochrome c, bovine pancreatic insulin β-chain (oxidized), bovine pancreatic insulin, chicken egg white lysozyme, luteinizing hormone releasing hormone fragment 1-6 amide, melittin, equine skeletal myoglobin, neurotensin, porcine N-acetyl renin substrate tetradecapeptide, substance P, and bovine erythrocyte ubiquitin were purchased from the Sigma-Aldrich Company (St. Louis, Mo.). GC-MS grade methanol, HPLC-grade acetonitrile, high purity acetone and HPLC-grade water were obtained from Honeywell Burdick and Jackson (Muskegon, Mich.). Trifluoroacetic acid and 10 M sodium hydroxide were obtained from J T Baker (Phillipsburg, N.J.). eCAP™ Neutral Marker was obtained from Beckman Coulter, Inc. (Fullerton, Calif.) and diluted 100-fold in acetonitrile.

Results

The efficacy of this procedure was determined using HPLC with UV detection. More than 99% of the high abundance proteins were removed. To gain an additional measure of the recovery of lower molecular weight peptides, a set of standard peptides was added to the serum at a known concentration. Recovery of endogenous and spiked peptides varied by peptide; in general, endogenous peptides were recovered at more than 70% (range: 65%-100%) and spiked peptides were recovered at more than 85% (range: 70-100%) (data not shown).

Example 8

A 50 μL sample of human serum was processed with or without the addition of 5 μL pepstatin A (a 1 mM solution of pepstatin A prepared in methanol, diluted 1:10 in water). Samples with and without pepstatin were added to 50 μL of 10% formic acid, and each sample was diluted to 500 μL with water, with standards added if desired. Each sample was passed over a C18 column using an acetonitrile gradient and monitored at 215 nm in an Agilent™ 110 as shown in FIG. 15. Examples of affected components are indicated by the arrows in FIG. 15.

A serum sample was processed with or without 0.1 μM pepstatin A as described above, and each sample was infused by electrospray using a Nanomate™ instrument (Advion, Inc.) linked to a QStar™ mass spectrometer, with the results shown in FIG. 16(a) (without pepstatin) and FIG. 16(b) (with pepstatin). A component affected by the addition of pepstatin is indicated with an arrow.

Example 9

Microfluidic-based capillary electrophoresis-mass spectrometry was used to identify prostate cancer markers. The objective was to find patterns that differentiate individuals with prostate cancer from those without, among subjects with a PSA value between 1 and 6 ng/ml.

Study Design

Samples were divided into discovery and validation sets. Data was collected from both sample sets concurrently. Data from the discovery samples was used to find a biomarker pattern, and data from the validation samples was used to evaluate how well the pattern can distinguish between the two groups of men (i.e., the validation data set was not used for training or testing in discovery cross-validation). Data was analyzed from each site's samples independently and then evaluated for overlap between the results. Table 6 provides a description of the samples and FIG. 17 provides a schematic overview of the samples.

Half of the 200 samples shown in FIG. 17 were used for discovery of patterns, as described above. These included 25 case and 25 control samples from site A and 25 case and 25 control samples from site B. Following pattern discovery, the second half of the 200 samples shown in FIG. 17 were used for validation of the patterns. Validation consisted of determining whether, for each sample, a pattern correctly identifies the sample as prostate cancer (case) or non-prostate cancer (control), using the decision function, D, described above.
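By way of illustration only, the tallying of validation calls described above can be sketched as follows. The function name, the label strings, and the example counts are hypothetical and are not results of the study; the sketch assumes each validation sample carries a known label and a call made by the pattern.

```python
def validation_metrics(true_labels, predicted_labels):
    """Tally case/control calls and derive summary statistics.

    Labels are the strings "case" or "control" (an assumed encoding).
    """
    tp = fp = tn = fn = 0
    for truth, call in zip(true_labels, predicted_labels):
        if truth == "case" and call == "case":
            tp += 1          # cancer sample called cancer
        elif truth == "case":
            fn += 1          # cancer sample called non-cancer
        elif call == "case":
            fp += 1          # non-cancer sample called cancer
        else:
            tn += 1          # non-cancer sample called non-cancer

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical validation set: 50 cases and 50 controls, invented calls.
truth = ["case"] * 50 + ["control"] * 50
calls = ["case"] * 40 + ["control"] * 10 + ["control"] * 45 + ["case"] * 5
metrics = validation_metrics(truth, calls)
```

With these invented calls, 40 of 50 cases and 45 of 50 controls are identified correctly; the numbers serve only to exercise the arithmetic.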

TABLE 6

                  Sites
Sample            Site A    Site B
Disease Cases     50        50
Control Cases     50        50

Sample Analysis

Serum samples were prepared, separated, and introduced into a mass spectrometer for analysis. Preparation included the removal of high abundance proteins, addition of preservatives and calibrants, and desalting. Prepared samples were then separated using microfluidic-based capillary electrophoresis (CE) in a ˜12 minute separation. Using an electrospray ionization (ESI) interface, samples were ionized and sprayed directly into a time-of-flight mass spectrometer (MS). The resulting CE-MS data for each sample was a series of mass spectra, acquired during the electrophoretic separation. Samples were prepared and analyzed in a randomized order to minimize biases.

Sample Criteria

Samples were collected pre-biopsy and pre-treatment, and samples were collected either before or after DRE. If a DRE had been performed, samples were collected at least 24 hours post-DRE.

Matching of cases and controls was done based on site, PSA levels, age at sample collection, date of sample draw, and race, in that order of priority.

A volume of approx. 10 cc of venous blood was drawn in serum tubes (“red or marble” top glass tube, BD Vacutainer). After sitting for a minimum of 30 minutes to a maximum of 12 hrs, the sample was centrifuged and the serum was collected and frozen (−80° C.).

Approximately 200 μL of serum was required for analysis from each patient.

TABLE 7 Inclusion and Exclusion Criteria

Cases, Objective 1

Inclusion:
PSA values in the 1-6 ng/ml range who have a confirmed diagnosis of prostate cancer. Reasons for biopsy of these individuals may include rising PSA, abnormal DRE, or high-risk status (e.g., family history of prostate cancer).

Exclusion:
Prior to entering this study, a history of any other cancer, other than non-melanoma skin cancer.
<40 years old.
Samples that have undergone more than 1 freeze/thaw cycle.

Prostate cancer diagnosis was based on pathological analysis of at least one 6-core TRUS guided biopsy.

To be considered a control, patients had at least one 6-core TRUS guided biopsy that did not find evidence of prostate cancer.

Control Samples

Spiked serum A was a control run at the beginning of each day. This consisted of serum that had been processed following the standard sample prep protocol and spiked with components at specific concentrations post processing. The composition can be found in Table 8.

TABLE 8 Spiked Serum A components

                      Concentration (nM)
Standard              Effective concentration    Actual concentration
                      in unprocessed serum       in resuspended serum
Pre-Processing
Ala-met enkephalin    100                        1000
Post-Processing
LHRH fragment         300                        3000
Bradykinin            300                        3000
Angiotensin III       300                        3000
Ubiquitin             300                        3000
Aprotinin             300                        3000
Renin                 300                        3000
Neurotensin           50                         500

Sample Preparation and Data Collection

Each sample was prepared 4 times and run 2 times on the CE-MS.

The 200 samples were prepared four times each. The 4 replicates of each prepared sample were pooled and re-divided into 4 aliquots. Two of those aliquots were used in CE-MS.

The standard sample preparation is outlined in FIG. 18. The composition of Sample Standard was 0.30 μM angiotensin III and 10.0 μL Aprotinin, and Sample Diluent was 390 μL HPLC water, 50 μL 10% formic acid, 5 μL Pepstatin 1:10 in H₂O, and 5 μL Sample Standard.

Samples for the run were thawed at room temperature and transferred to ice as soon as they were thawed. Runs were set up in duplicate on each of two μElute plates (n=4 for each sample). All samples were run individually. 450 μL of sample diluent was added to 50 μL of serum sample and mixed. Diluted samples were transferred immediately (within ten minutes) to a YM50 Microcon and centrifuged at 13,000×g for 30 minutes in the centrifuge with a 45° angle black anodized rotor. 25 μL of 10% trifluoroacetic acid was added just before application to reverse phase. Samples were processed on a μElute plate and collected in a PCR plate. Samples were dried in the vacuum centrifuge. Aliquots were re-suspended with 5 μL of re-suspension buffer of IPA and formic acid containing the post-processing standards bradykinin and renin at 3000 nM actual concentration in resuspended serum. Samples were vortexed for two minutes and centrifuged for 10 sec. After sample preparation, the 4 separate preparations were pooled and re-aliquoted.
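The volumes above imply a ten-fold dilution before filtration and a ten-fold concentration on resuspension, which is consistent with the two concentration columns of Table 8. The following arithmetic sketch illustrates this under the simplifying assumptions of complete recovery and of pooling and re-aliquoting that preserves the 50 μL to 5 μL ratio; the variable and function names are hypothetical.

```python
# Sample Diluent components and volumes (μL) as listed in the text.
SAMPLE_DILUENT_UL = {
    "HPLC water": 390.0,
    "10% formic acid": 50.0,
    "pepstatin 1:10 in water": 5.0,
    "sample standard": 5.0,
}
SERUM_UL = 50.0          # serum volume per preparation
RESUSPENSION_UL = 5.0    # re-suspension buffer volume

diluent_total_ul = sum(SAMPLE_DILUENT_UL.values())

# Dilution before filtration: (serum + diluent) / serum.
dilution_factor = (SERUM_UL + diluent_total_ul) / SERUM_UL

# Concentration on resuspension, assuming complete recovery (an assumption).
concentration_factor = SERUM_UL / RESUSPENSION_UL


def effective_concentration_nm(actual_nm, factor=concentration_factor):
    """Convert a concentration in resuspended serum to the equivalent
    concentration in unprocessed serum."""
    return actual_nm / factor
```

Under these assumptions, a standard at 3000 nM in resuspended serum corresponds to 300 nM in unprocessed serum, matching the paired values listed in Table 8.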

The mass spectrometer was set up with the inlet capillary voltage set to 280, PMT bias to −770, and MCP bias to −6000 in the volts window. The scan range was set to 122496, Number of Scans to 8000, Acq. Bin Width to 1 and threshold to 35. The spiked serum sample was run in the CE-MS to verify the intensities, resolution and migration times for the standards.

The mass spectrometer was rinsed with sample and then loaded with a chip of 1 μM set 6 in 20% IPA, 0.05% formic acid for chip infusion. A single-use vial of 1 μM set 6 in 20% IPA, 0.05% formic acid is run for chip infusion. After the pre-run is complete, the signal and resolution of the 1 μM neurotensin³⁺ peak at 558.3 m/z are monitored. The inlet lens voltage is adjusted in 0.05 V increments to obtain the optimum counts and resolution for neurotensin³⁺ (signal intensity: ≧150,000 counts; resolution: 6000-8000). When the intensity and resolution fall within these limits, another Spiked Serum A is run.
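The acceptance limits stated above for the neurotensin³⁺ peak can be expressed as a simple check. The following is a minimal sketch only; the function name and its parameters are hypothetical and do not correspond to any instrument software.

```python
def tuning_within_limits(signal_counts, resolution,
                         min_counts=150_000,
                         resolution_range=(6000, 8000)):
    """Return True when the neurotensin 3+ peak meets the stated limits:
    signal intensity of at least 150,000 counts and resolution of 6000-8000.
    """
    low, high = resolution_range
    return signal_counts >= min_counts and low <= resolution <= high


# Hypothetical readings taken after an inlet lens voltage adjustment.
ready = tuning_within_limits(signal_counts=180_000, resolution=7200)
```

In such a sketch, a False result would correspond to a further 0.05 V adjustment of the inlet lens voltage before Spiked Serum A is run again.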

Sample runs: Samples are removed from the −20° C. freezer and stored on ice during CE-MS runs for no longer than 4 hours. One sample is used to complete 1 CE-MS run and obtain the data. During sample runs, sprays were visually inspected for stability.

Data Analysis

CE-MS data were analyzed in several ways after data quality assurance. Peaks were identified using several methods, including mass-spectrometry-specific signal processing methods. First, univariate statistics were used to find single peak and/or component intensities that correlate with the presence/absence of prostate cancer. Standard non-parametric methods were used due to the small sample size and the inability to assume normality of data. Such methods include the Mann-Whitney test. Second, after ranking by P-value, results were visualized, and those peaks/components that have high group-mean differences were determined. Third, a suite of feature selection and pattern classification methods was used to find multi-variate patterns that distinguish between the presence and absence of prostate cancer. These methods include support vector machines, discriminant analysis, and other machine learning methods. Cross-validation techniques were utilized to train and test patterns. The sensitivities, specificities and positive/negative predictive values of patterns that can highly discriminate between classes were determined. Proteomic data were analyzed with and without PSA scores and other clinical measurements available.
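As a hedged illustration of the univariate step, the Mann-Whitney test for a single peak can be sketched as follows. This is not the implementation used in the study: the function name is hypothetical, the p-value uses a normal approximation without tie or continuity corrections, and the intensities are invented.

```python
import math


def mann_whitney_u(case_values, control_values):
    """Return (U, two-sided p) for the case group versus the control group.

    U counts the pairs in which the case value exceeds the control value,
    with ties counted as one half. The p-value is a large-sample normal
    approximation (no tie or continuity correction), which is only rough
    for small groups.
    """
    n, m = len(case_values), len(control_values)
    u = 0.0
    for x in case_values:
        for y in control_values:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    mean_u = n * m / 2.0
    sd_u = math.sqrt(n * m * (n + m + 1) / 12.0)
    z = (u - mean_u) / sd_u
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return u, p_two_sided


# Invented peak intensities for one component in each group.
case_intensities = [4.1, 5.0, 6.3]
control_intensities = [1.2, 2.2, 3.4]
u_stat, p_value = mann_whitney_u(case_intensities, control_intensities)
```

Applying such a function to every peak and sorting by p-value would give the ranking referred to in the second step above.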

The markers identified are shown in Tables 9 and 10A-10D below.

TABLE 9

(*molecular weight for the indicated entities is as shown or +1 dalton)

Biomarker  Charge  Observed m/z  Monoisotopic*  Molecular Weight  Separation Time (sec)   Up or down regulated
                   (thomson)     or average     (Daltons)         (+/−64 sec for 95% CI)  in cancer cells
                                 for m/z
1*         1       2.9511E+02    monoisotopic   294               214                     down
2          9       1.5433E+03    average        13880             452                     up
           10      1.3890E+03    average        13880             452
           11      1.2629E+03    average        13880             452
           12      1.1577E+03    average        13880             452
           13      1.0687E+03    average        13880             452
           14      9.9246E+02    average        13880             452
           15      9.2636E+02    average        13880             452
           16      8.6852E+02    average        13880             452
           17      8.1749E+02    average        13880             452
           18      7.7213E+02    average        13880             452
           19      7.3155E+02    average        13880             452
           20      6.9502E+02    average        13880             452
           21      6.6197E+02    average        13880             452

TABLE 10A

(*molecular weight for the indicated entities is as shown or +1 dalton)

Biomarker  Charge  Observed m/z  Monoisotopic*  Molecular Weight  Separation Time (sec)   Up or down regulated
                   (thomson)     or average     (Daltons)         (+/−64 sec for 95% CI)  in cancer cells
                                 for m/z
3          2       5.2576E+02    monoisotopic   1050              230                     down
4          1       5.2035E+02    monoisotopic   519               192                     down
           2       2.6067E+02    monoisotopic   519               192
5          8       1.1336E+03    average        9061              708                     up
           9       1.0077E+03    average        9061              708
           10      9.0707E+02    average        9061              708
6          4       1.0513E+03    monoisotopic   4201              341                     up
           5       8.4127E+02    monoisotopic   4201              341
7*         1       4.9723E+02    monoisotopic   496               279                     down
8          3       1.1113E+03    monoisotopic   3331              452                     up
           4       8.3369E+02    monoisotopic   3331              452
           5       6.6715E+02    monoisotopic   3331              452
9          3       7.2164E+02    monoisotopic   2162              495                     up
           4       5.4148E+02    monoisotopic   2162              495
10         6       1.0291E+03    average        6169              452                     up
           7       8.8222E+02    average        6169              452
           8       7.7207E+02    average        6169              452
11         4       8.2773E+02    monoisotopic   3307              331                     up
12         7       1.3279E+03    average        9288              643                     up
           8       1.1620E+03    average        9288              643
           9       1.0330E+03    average        9288              643
           10      9.2982E+02    average        9288              643
13         7       1.1050E+03    average        7728              400                     up
           8       9.6701E+02    average        7728              400
           9       8.5967E+02    average        7728              400
14         7       1.3279E+03    average        9289              633                     up
           8       1.1621E+03    average        9289              633
           9       1.0331E+03    average        9289              633
           10      9.2986E+02    average        9289              633
15         4       8.0696E+02    monoisotopic   3224              564                     up
           5       6.4576E+02    monoisotopic   3224              564
16         1       7.6536E+02    monoisotopic   764               235                     down
           2       3.8318E+02    monoisotopic   764               235
17*        1       6.1935E+02    monoisotopic   618               265                     up
18         6       9.5430E+02    average        5720              483                     up
           7       8.1812E+02    average        5720              483
           8       7.1598E+02    average        5720              483
           9       6.3653E+02    average        5720              483

TABLE 10B

(*molecular weight for the indicated entities is as shown or +1 dalton)

Biomarker  Charge  Observed m/z  Monoisotopic*  Molecular Weight  Separation Time (sec)   Up or down regulated
                   (thomson)     or average     (Daltons)         (+/−64 sec for 95% CI)  in cancer cells
                                 for m/z
19         2       6.9929E+02    monoisotopic   1397              246                     up
20         12      9.5422E+02    average        11439             482                     up
           13      8.8089E+02    average        11439             482
           14      8.1804E+02    average        11439             482
           15      7.6357E+02    average        11439             482
           16      7.1591E+02    average        11439             482
           17      6.7386E+02    average        11439             482
           18      6.3648E+02    average        11439             482
21         13      1.0812E+03    average        14043             451                     up
           14      1.0040E+03    average        14043             451
           15      9.3718E+02    average        14043             451
           16      8.7867E+02    average        14043             451
           17      8.2704E+02    average        14043             451
           18      7.8115E+02    average        14043             451
           19      7.4009E+02    average        14043             451
22         3       5.4295E+02    monoisotopic   1626              470                     up
           4       4.0747E+02    monoisotopic   1626              470
23*        1       3.3413E+02    monoisotopic   333               296                     up
24         13      1.0569E+03    average        13727             455                     up
           14      9.8152E+02    average        13727             455
           15      9.1615E+02    average        13727             455
           16      8.5896E+02    average        13727             455
           17      8.0849E+02    average        13727             455
           18      7.6363E+02    average        13727             455
           19      7.2349E+02    average        13727             455
25         14      9.9214E+02    average        13876             494                     up
           15      9.2607E+02    average        13876             494
           16      8.6825E+02    average        13876             494
           17      8.1723E+02    average        13876             494
           18      7.7189E+02    average        13876             494
26*        1       2.2911E+02    monoisotopic   228               193                     down
27*        1       3.2712E+02    monoisotopic   326               194                     up
28         2       4.8368E+02    monoisotopic   965               199                     up
29*        1       2.5715E+02    monoisotopic   256               199                     down
30         1       6.2533E+02    monoisotopic   624               306                     up
           2       3.1316E+02    monoisotopic   624               306
           3       2.0911E+02    monoisotopic   624               306
31         2       4.4813E+02    monoisotopic   894               235                     down

TABLE 10C

(*molecular weight for the indicated entities is as shown or +1 dalton)

Biomarker  Charge  Observed m/z  Monoisotopic*  Molecular Weight  Separation Time (sec)   Up or down regulated
                   (thomson)     or average     (Daltons)         (+/−64 sec for 95% CI)  in cancer cells
                                 for m/z
32         1       8.5739E+02    monoisotopic   856               235                     down
           2       4.2920E+02    monoisotopic   856               235
33         7       1.7797E+03    average        12451             373                     up
           8       1.5574E+03    average        12451             373
           9       1.3845E+03    average        12451             373
34         3       6.1932E+02    monoisotopic   1855              328                     up
35         10      1.1739E+03    average        11729             601                     up
           11      1.0673E+03    average        11729             601
           12      9.7840E+02    average        11729             601
           13      9.0322E+02    average        11729             601
           14      8.3878E+02    average        11729             601
36         13      1.0700E+03    average        13897             451                     up
           14      9.9366E+02    average        13897             451
           15      9.2748E+02    average        13897             451
           16      8.6957E+02    average        13897             451
           17      8.1848E+02    average        13897             451
           18      7.7307E+02    average        13897             451
           19      7.3243E+02    average        13897             451
           20      6.9586E+02    average        13897             451
37         11      1.2593E+03    average        13841             443                     up
           12      1.1544E+03    average        13841             443
           13      1.0657E+03    average        13841             443
           14      9.8967E+02    average        13841             443
           15      9.2376E+02    average        13841             443
           16      8.6609E+02    average        13841             443
           17      8.1520E+02    average        13841             443
           18      7.6997E+02    average        13841             443
           19      7.2949E+02    average        13841             443

TABLE 10D

(*molecular weight for the indicated entities is as shown or +1 dalton)

Biomarker  Charge  Observed m/z  Monoisotopic*  Molecular Weight  Separation Time (sec)   Up or down regulated
                   (thomson)     or average     (Daltons)         (+/−64 sec for 95% CI)  in cancer cells
                                 for m/z
38         11      1.2717E+03    average        13978             452                     up
           12      1.1659E+03    average        13978             452
           13      1.0762E+03    average        13978             452
           14      9.9944E+02    average        13978             452
           15      9.3288E+02    average        13978             452
           16      8.7464E+02    average        13978             452
           17      8.2325E+02    average        13978             452
           18      7.7757E+02    average        13978             452
39         6       1.1060E+03    average        6630              585                     up
           7       9.4818E+02    average        6630              585
           8       8.2978E+02    average        6630              585
           9       7.3769E+02    average        6630              585
           10      6.6402E+02    average        6630              585
           11      6.0375E+02    average        6630              585
40*        1       6.8650E+02    monoisotopic   686               195                     up
41*        1       3.1314E+02    monoisotopic   312               305                     up
42         2       7.3335E+02    monoisotopic   1465              266                     down
           3       4.8924E+02    monoisotopic   1465              266
           4       3.6718E+02    monoisotopic   1465              266
43         2       4.9167E+02    monoisotopic   981               198                     up
44         1       9.4442E+02    monoisotopic   943               198                     up
           2       4.7271E+02    monoisotopic   943               198
45*        1       2.7310E+02    monoisotopic   272               192                     down
46*        1       229.1146625   monoisotopic   228               337                     down
47*        1       342.145859    monoisotopic   341               440                     up
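The relationship between the charge, observed m/z, and molecular weight columns can be illustrated with a short calculation. The sketch below assumes positive-ion electrospray in which each charge is carried by an added proton of about 1.00728 Da; the source does not state this, and the names are hypothetical. The m/z values are those listed for biomarker 39 in Table 10D.

```python
PROTON_MASS_DA = 1.00728  # assumed mass of the charge-carrying proton


def neutral_mass(observed_mz, charge):
    """Estimate the neutral molecular weight from one charge state,
    assuming an [M + zH]z+ ion."""
    return charge * (observed_mz - PROTON_MASS_DA)


# Charge state -> observed m/z for biomarker 39 (Table 10D).
biomarker_39 = {
    6: 1106.0,
    7: 948.18,
    8: 829.78,
    9: 737.69,
    10: 664.02,
    11: 603.75,
}

masses = [neutral_mass(mz, z) for z, mz in biomarker_39.items()]
mean_mass = sum(masses) / len(masses)
```

Under this assumption each charge state gives an estimate within about 1 Da of the tabulated molecular weight of 6630 Daltons, which is why several rows of differing charge are grouped under a single biomarker.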

The above examples are in no way intended to limit the scope of the invention. Further, it can be appreciated by one of ordinary skill in the art that many changes and modifications can be made thereto without departing from the spirit or scope of the appended claims, and such changes and modifications are contemplated within the scope of the instant invention.

Example 10

In one embodiment, deciding whether a test sample comes from a patient that has prostate cancer is computed as follows:

Identify the intensity levels for every marker in Table 9 for every reference sample and for the test sample. The reference samples are those samples defined in the study design. Sum together the intensities for all charge states for a given biomarker. This yields a set of summed intensities, two intensities for every sample. Let the intensities for the test sample be identified by T=(biomarker 1 intensity for test sample, biomarker 2 intensity for test sample). Let the intensities for each of the reference samples be identified by R(i)=(biomarker 1 intensity for sample i, biomarker 2 intensity for sample i).

A comparison between the test sample, T, and reference sample, R(i), is done by taking a dot product between the two:

(T*R(i))=(biomarker 1 intensity for test sample)*(biomarker 1 intensity for sample i)+(biomarker 2 intensity for test sample)*(biomarker 2 intensity for sample i)

A decision function, D, is made from these comparisons by computing a function that appropriately weights them:

D=(\sum_i \alpha_i*(T*R(i)))+b

The alpha_i and b parameters are numbers that are appropriate for deciding whether the patient has prostate cancer based on the reference samples.

The decision is made that the patient has prostate cancer if the function D is greater than 0 and that the patient does not have prostate cancer if the function D is less than or equal to 0.
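The computation of this embodiment can be sketched as follows. The sketch is illustrative only: the function names are hypothetical, and the intensities, alpha_i weights, and b value are invented for the example rather than taken from the reference samples of the study.

```python
def dot(t, r):
    """Dot product (T*R(i)) of two summed-intensity vectors."""
    return sum(a * b for a, b in zip(t, r))


def decision_function(test_sample, reference_samples, alphas, b):
    """D = (sum over i of alpha_i * (T*R(i))) + b."""
    return sum(alpha * dot(test_sample, ref)
               for alpha, ref in zip(alphas, reference_samples)) + b


def has_prostate_cancer(test_sample, reference_samples, alphas, b):
    """Cancer is called when D > 0; D <= 0 is called as no cancer."""
    return decision_function(test_sample, reference_samples, alphas, b) > 0


# Invented values: two reference samples, two biomarkers per sample.
references = [(1.0, 0.0), (0.0, 1.0)]   # R(1), R(2)
alphas = [1.0, -1.0]                    # alpha_1, alpha_2
b = -0.5

d_value = decision_function((2.0, 1.0), references, alphas, b)
call = has_prostate_cancer((2.0, 1.0), references, alphas, b)
```

For the invented test sample T=(2.0, 1.0), D=(1.0*2.0)+(−1.0*1.0)−0.5=0.5, so the sample would be called as prostate cancer. A decision function of this form is the kind produced by a linear support vector machine, in which case the alpha_i and b values would come from training on the reference samples.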

What is claimed is:
 1. A method comprising: a) collecting more than 10 case samples representing a clinical phenotypic state and more than 10 control samples representing individuals without said clinical phenotypic state; b) using electrophoresis followed by a mass spectrometry platform system to obtain mass spectral components in said case samples and in said control samples without regard to a specific sequence of at least some of said mass spectral components; c) identifying in a computer system representative patterns of markers that distinguish datasets from case samples and control samples wherein said patterns contain more than 15 markers that are represented on output of said mass spectrometer, but the specific sequence of said more than 15 markers is not known; d) from blood samples of patients, in a computer system, identifying in patient samples said more than 15 markers wherein the specific sequence of said more than 15 markers is not known.
 2. The method as recited in claim 1, wherein diagnostic products are marketed using said markers in a clinical reference laboratory.
 3. The method as recited in claim 1, further comprising the step of collecting said samples in collaboration with a collaborator.
 4. The method as recited in claim 3, wherein said collaborator is an academic collaborator.
 5. The method as recited in claim 3, wherein said collaborator is a pharmaceutical company.
 6. The method as recited in claim 5, wherein said pharmaceutical company collects said samples in a clinical trial.
 7. The method as recited in claim 1, wherein data from one of said samples are being processed computationally while another of said samples are in said mass spectrometry platform.
 8. The method as recited in claim 1, wherein said markers are polypeptides.
 9. The method as recited in claim 8, wherein said patterns contain more than 30 polypeptides that are represented on output of said mass spectrometer, but the specific sequence of said more than 30 polypeptides is not known.
 10. The method as recited in claim 8, wherein said patterns contain more than 50 polypeptides that are represented on output of said mass spectrometer, but the specific sequence of said more than 50 polypeptides is not known.
 11. The method as recited in claim 8, wherein said patterns contain more than 100 polypeptides that are represented on output of said mass spectrometer, but the specific sequence of said more than 100 polypeptides is not known.
 12. The method as recited in claim 8, wherein said samples contain more than 1000 polypeptides that are represented on output of said mass spectrometer, but the specific sequence of said more than 1000 polypeptides is not known.
 13. The method as recited in claim 1, wherein more than 50 of said cases samples and 50 of said control samples are used.
 14. The method as recited in claim 1, wherein more than 100 of said case samples and 100 of said control samples are used.
 15. The method as recited in claim 1, wherein said diagnostic products use said mass spectrometry platform.
 16. The method as recited in claim 1, wherein said step of using a mass spectrometry platform is preceded by the step of preparing said samples on a microfluidics device.
 17. The method as recited in claim 16, wherein said diagnostic products are marketed with a disposable microfluidics device, said disposable microfluidics device processing diagnostic samples for use in said mass spectrometry platform.
 18. The method as recited in claim 16, wherein said microfluidics device comprises a separations device.
 19. The method as recited in claim 1, wherein said mass spectrometry platform is a time of flight mass spectrometer.
 20. The method as recited in claim 1, wherein said mass spectrometer is a Hadamard time of flight mass spectrometer.
 21. The method as recited in claim 1, wherein said diagnostic products are marketed by a diagnostic partner.
 22. The method as recited in claim 1, wherein said phenotype is a disease diagnostic phenotype.
 23. The method as recited in claim 16, wherein said microfluidics device comprises an electrospray source.
 24. The method as recited in claim 1, wherein said samples contain complex mixtures of polypeptides.